Goto

Collaborating Authors

 gap safe rule




GAP Safe screening rules for sparse multi-task and multi-class models

Neural Information Processing Systems

Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly the rules are said to be safe.


GAP Safe Screening Rules for Sparse-Group Lasso

Neural Information Processing Systems

For statistical learning in high dimension, sparse regularizations have proven useful to boost both computational and statistical efficiency. In some contexts, it is natural to handle more refined structures than pure sparsity, such as for instance group sparsity. Sparse-Group Lasso has recently been introduced in the context of linear regression to enforce sparsity both at the feature and at the group level. We propose the first (provably) safe screening rules for Sparse-Group Lasso, i.e., rules that allow to discard early in the solver features/groups that are inactive at optimal solution. Thanks to efficient dual gap computations relying on the geometric properties of ɛ-norm, safe screening rules for Sparse-Group Lasso lead to significant gains in term of computing time for our coordinate descent implementation.


Dual Extrapolation for Faster Lasso Solvers

Massias, Mathurin, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine Learning

Convex sparsity-inducing regularizations are ubiquitous in high-dimension machine learning, but their non-differentiability requires the use of iterative solvers. To accelerate such solvers, state-of-the-art approaches consist in reducing the size of the optimization problem at hand. In the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). Duality comes into play at several steps in these techniques. Here, we propose an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of an improved dual point. This enables a tighter control of optimality as used in stopping criterion, as well as better screening performance of Gap Safe rules. Finally, we propose a working set strategy based on an aggressive use of Gap Safe rules and our new dual point construction, which improves state-of-the-art time performance on Lasso problems.


Gap Safe screening rules for sparsity enforcing penalties

Ndiaye, Eugene, Fercoq, Olivier, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine Learning

In high dimensional regression settings, sparsity enforcing penalties have proved useful to regularize the data-fitting term. A recently introduced technique called screening rules propose to ignore some variables in the optimization leveraging the expected sparsity of the solutions and consequently leading to faster solvers. When the procedure is guaranteed not to discard variables wrongly the rules are said to be safe. In this work, we propose a unifying framework for generalized linear models regularized with standard sparsity enforcing penalties such as $\ell_1$ or $\ell_1/\ell_2$ norms. Our technique allows to discard safely more variables than previously considered safe rules, particularly for low regularization parameters. Our proposed Gap Safe rules (so called because they rely on duality gap computation) can cope with any iterative solver but are particularly well suited to (block) coordinate descent methods. Applied to many standard learning tasks, Lasso, Sparse-Group Lasso, multi-task Lasso, binary and multinomial logistic regression, etc., we report significant speed-ups compared to previously proposed safe rules on all tested data sets.


GAP Safe Screening Rules for Sparse-Group Lasso

Ndiaye, Eugene, Fercoq, Olivier, Gramfort, Alexandre, Salmon, Joseph

Neural Information Processing Systems

For statistical learning in high dimension, sparse regularizations have proven useful to boost both computational and statistical efficiency. In some contexts, it is natural to handle more refined structures than pure sparsity, such as for instance group sparsity. Sparse-Group Lasso has recently been introduced in the context of linear regression to enforce sparsity both at the feature and at the group level. We propose the first (provably) safe screening rules for Sparse-Group Lasso, i.e., rules that allow to discard early in the solver features/groups that are inactive at optimal solution. Thanks to efficient dual gap computations relying on the geometric properties of $\epsilon$-norm, safe screening rules for Sparse-Group Lasso lead to significant gains in term of computing time for our coordinate descent implementation.


GAP Safe Screening Rules for Sparse-Group-Lasso

Ndiaye, Eugene, Fercoq, Olivier, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine Learning

In high dimensional settings, sparse structures are crucial for efficiency, either in term of memory, computation or performance. In some contexts, it is natural to handle more refined structures than pure sparsity, such as for instance group sparsity. Sparse-Group Lasso has recently been introduced in the context of linear regression to enforce sparsity both at the feature level and at the group level. We adapt to the case of Sparse-Group Lasso recent safe screening rules that discard early in the solver irrelevant features/groups. Such rules have led to important speed-ups for a wide range of iterative methods. Thanks to dual gap computations, we provide new safe screening rules for Sparse-Group Lasso and show significant gains in term of computing time for a coordinate descent implementation.


GAP Safe screening rules for sparse multi-task and multi-class models

Ndiaye, Eugene, Fercoq, Olivier, Gramfort, Alexandre, Salmon, Joseph

Neural Information Processing Systems

High dimensional regression benefits from sparsity promoting regularizations. Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly the rules are said to be safe. In this paper we derive new safe rules for generalized linear models regularized with L1 and L1/L2 norms. The rules are based on duality gap computations and spherical safe regions whose diameters converge to zero. This allows to discard safely more variables, in particular for low regularization parameters. The GAP Safe rule can cope with any iterative solver and we illustrate its performance on coordinate descent for multi-task Lasso, binary and multinomial logistic regression, demonstrating significant speed ups on all tested datasets with respect to previous safe rules.


GAP Safe screening rules for sparse multi-task and multi-class models

Ndiaye, Eugene, Fercoq, Olivier, Gramfort, Alexandre, Salmon, Joseph

arXiv.org Machine Learning

High dimensional regression benefits from sparsity promoting regularizations. Screening rules leverage the known sparsity of the solution by ignoring some variables in the optimization, hence speeding up solvers. When the procedure is proven not to discard features wrongly the rules are said to be \emph{safe}. In this paper we derive new safe rules for generalized linear models regularized with $\ell_1$ and $\ell_1/\ell_2$ norms. The rules are based on duality gap computations and spherical safe regions whose diameters converge to zero. This allows to discard safely more variables, in particular for low regularization parameters. The GAP Safe rule can cope with any iterative solver and we illustrate its performance on coordinate descent for multi-task Lasso, binary and multinomial logistic regression, demonstrating significant speed ups on all tested datasets with respect to previous safe rules.