* 7 pages, 3 figures, extended version of the paper presented to ICMLA'13

We propose a new stochastic coordinate descent method for minimizing the sum of convex functions, each of which depends on a small number of coordinates only. Our method (APPROX) is simultaneously Accelerated, Parallel and PROXimal; this is the first time such a method has been proposed. In the special case when the number of processors is equal to the number of coordinates, the method converges at the rate $2\bar{\omega}\bar{L} R^2/(k+1)^2$, where $k$ is the iteration counter, $\bar{\omega}$ is an average degree of separability of the loss function, $\bar{L}$ is the average of the Lipschitz constants associated with the coordinates and individual functions in the sum, and $R$ is the distance of the initial point from the minimizer. We show that the method can be implemented without the need to perform full-dimensional vector operations, which is the major bottleneck of existing accelerated coordinate descent methods. The fact that the method depends on the average degree of separability, rather than the maximum degree of separability, can be attributed to the use of new safe large stepsizes, leading to an improved expected separable overapproximation (ESO). These stepsizes are of independent interest and can be used in all existing parallel stochastic coordinate descent algorithms based on the concept of ESO.
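
The accelerated template can be sketched in its simplest serial form: one coordinate per iteration, no proximal term, and full-dimensional vector operations for clarity, so it deliberately omits the parallel, proximal and efficient-implementation features that the paper contributes. All names below are illustrative, not the paper's code.

```python
import numpy as np

def approx_quadratic(A, b, iters=20000, seed=0):
    """Serial sketch of an accelerated coordinate descent step
    on f(x) = 0.5 * ||A x - b||^2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    L = (A ** 2).sum(axis=0)                 # coordinate Lipschitz constants
    x = np.zeros(n)
    z = np.zeros(n)
    theta = 1.0 / n
    for _ in range(iters):
        y = (1 - theta) * x + theta * z      # interpolation step
        i = rng.integers(n)                  # uniform coordinate sampling
        g = A[:, i] @ (A @ y - b)            # i-th partial derivative at y
        delta = -g / (n * theta * L[i])
        z[i] += delta                        # z-step touches coordinate i only
        x = y.copy()
        x[i] += n * theta * delta            # x = y + n * theta * (z_new - z_old)
        theta = 0.5 * (np.sqrt(theta ** 4 + 4 * theta ** 2) - theta ** 2)
    return x
```

Here `theta` follows the standard accelerated schedule $\theta_{k+1}^2 = (1-\theta_{k+1})\theta_k^2$, which decays like $2/(k + 2/\theta_0)$ and yields the $O(1/k^2)$ rate.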

* 25 pages, 2 algorithms, 6 tables, 3 figures

Smooth minimization of nonsmooth functions with parallel coordinate descent methods

Sep 23, 2013

Olivier Fercoq, Peter Richtárik

* 39 pages, 1 algorithm, 3 figures, 2 tables

Mind the duality gap: safer rules for the Lasso

Dec 03, 2015

Olivier Fercoq, Alexandre Gramfort, Joseph Salmon

Screening rules allow irrelevant variables to be discarded early from the optimization in Lasso problems and their derivatives, making solvers faster. In this paper, we propose new versions of the so-called *safe rules* for the Lasso. Based on duality gap considerations, our new rules create safe test regions whose diameters converge to zero, provided that one relies on a converging solver. This property helps screen out more variables, for a wider range of regularization parameter values. In addition to faster convergence, we prove that we correctly identify the active sets (supports) of the solutions in finite time. While our proposed strategy can cope with any solver, its performance is demonstrated using a coordinate descent algorithm particularly suited to machine learning use cases. Significant computing time reductions are obtained with respect to previous safe rules.
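
The duality-gap test behind such rules can be sketched for the plain Lasso $\min_\beta \frac{1}{2}\|y - X\beta\|^2 + \lambda\|\beta\|_1$. The function below is an illustrative reconstruction, not the paper's exact rule: a dual-feasible point is built by rescaling the residual, and any feature whose correlation stays below one over the whole gap-radius sphere is provably inactive.

```python
import numpy as np

def gap_safe_screen(X, y, beta, lam):
    """Gap-based safe sphere test for the Lasso (sketch).

    Given any primal iterate `beta`, returns a boolean mask of
    features guaranteed to be zero at the optimum."""
    r = y - X @ beta                            # primal residual
    corr = X.T @ r
    theta = r / max(lam, np.abs(corr).max())    # dual-feasible point
    primal = 0.5 * r @ r + lam * np.abs(beta).sum()
    dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2 * gap) / lam             # safe sphere radius
    norms = np.linalg.norm(X, axis=0)
    return np.abs(X.T @ theta) + radius * norms < 1.0
```

As the solver converges, the gap (hence the radius) shrinks to zero, which is why the test eventually screens every inactive feature.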

* erratum to ICML 2015: "The authors would like to thank Jalal Fadili and Jingwei Liang for helping clarify some misleading statements on the equicorrelation set"

Improving Evolutionary Strategies with Generative Neural Networks

Jan 31, 2019

Louis Faury, Clement Calauzenes, Olivier Fercoq, Syrine Krichen

Stochastic Conditional Gradient Method for Composite Convex Minimization

Jan 29, 2019

Francesco Locatello, Alp Yurtsever, Olivier Fercoq, Volkan Cevher

In this paper, we propose the first practical algorithm to minimize stochastic composite optimization problems over compact convex sets. This template allows for affine constraints and therefore covers stochastic semidefinite programs (SDPs), which are widely applicable in both machine learning and statistics. In this setup, stochastic algorithms with convergence guarantees are either not known or not tractable. We tackle this general problem and propose a convergent, easy-to-implement and tractable algorithm. We prove an $\mathcal{O}(k^{-1/3})$ convergence rate in expectation on the objective residual and $\mathcal{O}(k^{-5/12})$ in expectation on the feasibility gap. These rates are achieved without increasing the batch size, which can be as small as a single sample. We present extensive empirical evidence demonstrating the superiority of our algorithm on a broad range of applications, including optimization of stochastic SDPs.
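
The gradient-averaging idea at the core of such single-sample conditional gradient methods can be sketched with a stochastic Frank-Wolfe loop. The parameter schedules below are illustrative choices, and the paper's affine-constraint smoothing is omitted.

```python
import numpy as np

def stochastic_frank_wolfe(grad_sample, lmo, x0, iters=3000, seed=0):
    """Stochastic Frank-Wolfe with a decaying-average gradient estimator.

    `grad_sample(x, rng)` returns an unbiased single-sample gradient;
    `lmo(d)` solves the linear minimization oracle over the compact
    feasible set.  Averaging damps gradient noise without batch growth."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    d = np.zeros_like(x)
    for k in range(1, iters + 1):
        rho = 2.0 / (k + 8) ** (2.0 / 3.0)   # averaging weight (illustrative)
        gamma = 2.0 / (k + 1)                # Frank-Wolfe step size
        d = (1 - rho) * d + rho * grad_sample(x, rng)
        x = x + gamma * (lmo(d) - x)         # move toward the LMO vertex
    return x
```

Since each iterate is a convex combination of feasible vertices, the method stays feasible by construction, which is what makes it attractive for SDP-like constraint sets.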

Neural Generative Models for Global Optimization with Gradients

Jun 14, 2018

Louis Faury, Flavian Vasile, Clément Calauzènes, Olivier Fercoq

Gap Safe screening rules for sparsity enforcing penalties

Dec 27, 2017

Eugene Ndiaye, Olivier Fercoq, Alexandre Gramfort, Joseph Salmon

Generalized Concomitant Multi-Task Lasso for sparse multimodal regression

Oct 18, 2017

Mathurin Massias, Olivier Fercoq, Alexandre Gramfort, Joseph Salmon

In high dimension, it is customary to consider Lasso-type estimators to enforce sparsity. For standard Lasso theory to hold, the regularization parameter should be proportional to the noise level, yet the latter is generally unknown in practice. A possible remedy is to consider estimators, such as the Concomitant/Scaled Lasso, which jointly optimize over the regression coefficients as well as over the noise level, making the choice of the regularization independent of the noise level. However, when data from different sources are pooled to increase sample size, or when dealing with multimodal datasets, noise levels typically differ and new dedicated estimators are needed. In this work we provide new statistical and computational solutions to deal with such heteroscedastic regression models, with an emphasis on functional brain imaging with combined magneto- and electroencephalographic (M/EEG) signals. Adopting the formulation of Concomitant Lasso-type estimators, we propose a jointly convex formulation to estimate both the regression coefficients and the (square root of the) noise covariance. When our framework is instantiated to de-correlated noise, it leads to an efficient algorithm whose computational cost is not higher than for the Lasso and Concomitant Lasso, while addressing more complex noise structures. Numerical experiments demonstrate that our estimator yields improved prediction and support identification while correctly estimating the noise (square root) covariance. Results on multimodal neuroimaging problems with M/EEG data are also reported.
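
The scalar-noise, single-task special case of such concomitant estimators admits a simple alternating scheme, sketched below. This is an illustrative simplification of the paper's multi-task, covariance-estimating formulation; the objective and update names are assumptions made for the sketch.

```python
import numpy as np

def concomitant_lasso(X, y, lam, sigma0=1e-3, iters=100):
    """Alternating sketch of a single-task Concomitant Lasso:

        min_{beta, sigma >= sigma0}  ||y - X beta||^2 / (2 n sigma)
                                       + sigma / 2 + lam * ||beta||_1

    For fixed beta, sigma = max(||y - X beta|| / sqrt(n), sigma0) in
    closed form; for fixed sigma the beta-step is a Lasso, solved here
    by cyclic coordinate descent with soft-thresholding."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y.copy()                                  # residual y - X @ beta
    sigma = max(np.linalg.norm(r) / np.sqrt(n), sigma0)
    for _ in range(iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            rho = X[:, j] @ r + col_sq[j] * old
            # for fixed sigma the Lasso threshold is n * sigma * lam
            beta[j] = np.sign(rho) * max(abs(rho) - n * sigma * lam, 0.0) / col_sq[j]
            r += X[:, j] * (old - beta[j])
        sigma = max(np.linalg.norm(r) / np.sqrt(n), sigma0)
    return beta, sigma
```

The key point visible in the sketch is that the noise level `sigma` is estimated jointly with the coefficients, so the effective regularization `n * sigma * lam` adapts to the data rather than requiring prior knowledge of the noise.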

GAP Safe Screening Rules for Sparse-Group-Lasso

Feb 19, 2016

Eugene Ndiaye, Olivier Fercoq, Alexandre Gramfort, Joseph Salmon

GAP Safe screening rules for sparse multi-task and multi-class models

Nov 18, 2015

Eugene Ndiaye, Olivier Fercoq, Alexandre Gramfort, Joseph Salmon

* in Proceedings of the 29th Conference on Neural Information Processing Systems (NIPS), 2015

SDNA: Stochastic Dual Newton Ascent for Empirical Risk Minimization

Feb 08, 2015

Zheng Qu, Peter Richtárik, Martin Takáč, Olivier Fercoq

Fast Distributed Coordinate Descent for Non-Strongly Convex Losses

Jul 27, 2014

Olivier Fercoq, Zheng Qu, Peter Richtárik, Martin Takáč

Smooth Primal-Dual Coordinate Descent Algorithms for Nonsmooth Convex Optimization

Nov 09, 2017

Ahmet Alacaoglu, Quoc Tran-Dinh, Olivier Fercoq, Volkan Cevher

We propose a new randomized coordinate descent method for a convex optimization template with broad applications. Our analysis relies on a novel combination of four ideas applied to the primal-dual gap function: smoothing, acceleration, homotopy, and coordinate descent with non-uniform sampling. As a result, our method is the first coordinate descent method whose convergence rate guarantees are the best known under a variety of common structural assumptions on the template. We provide numerical evidence supporting the theoretical results, with a comparison to state-of-the-art algorithms.
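
The smoothing ingredient can be illustrated on the absolute value: Nesterov's technique replaces a nonsmooth term $g$ by $g_\mu(u) = \max_y \{\langle u, y\rangle - g^*(y) - \tfrac{\mu}{2}\|y\|^2\}$, and for $g = |\cdot|$ this maximum is the Huber function. A minimal sketch (the homotopy in the paper gradually decreases $\mu$):

```python
import numpy as np

def huber(u, mu):
    """Nesterov smoothing of |u|: max over |y| <= 1 of (u*y - (mu/2)*y**2).

    The maximizer is y = u/mu when |u| <= mu (giving u**2 / (2*mu)) and
    y = sign(u) otherwise (giving |u| - mu/2), so the smooth surrogate
    never exceeds |u| and underestimates it by at most mu/2."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= mu, u ** 2 / (2 * mu), np.abs(u) - mu / 2)
```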

* NIPS 2017

Safe Grid Search with Optimal Complexity

Oct 12, 2018

Eugene Ndiaye, Tam Le, Olivier Fercoq, Joseph Salmon, Ichiro Takeuchi

Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression

Jun 08, 2016

Eugene Ndiaye, Olivier Fercoq, Alexandre Gramfort, Vincent Leclère, Joseph Salmon
