

Connecting Optimization and Regularization Paths

Neural Information Processing Systems

We study the implicit regularization properties of optimization techniques by explicitly connecting their optimization paths to the regularization paths of "corresponding" regularized problems. This surprising connection shows that the iterates of optimization techniques such as gradient descent and mirror descent are pointwise close to the solutions of appropriately regularized objectives. While such a tight connection between optimization and regularization is of independent intellectual interest, it also has important implications for machine learning: we can port results from regularized estimators to optimization, and vice versa. We investigate one key consequence, which borrows from the well-studied analysis of regularized estimators to obtain tight excess risk bounds for the iterates generated by optimization techniques.




Reviews: Connecting Optimization and Regularization Paths

Neural Information Processing Systems

The authors explore the relation between the trajectory of Gradient Descent (GD) initialized at the origin and the regularization path of l2-regularized minimization of the same objective. They first study the continuous-time setting, where GD is replaced by gradient flow, assuming the objective is smooth and strongly convex. The main result (Theorem 1, whose proof I have verified) is as follows: under an appropriate scaling between the time t in GD and the inverse regularization parameter η, the two trajectories do not diverge much. This result is obtained by quantifying the shrinkage of the gradients as t and η tend to infinity. In the continuous-time setting, the authors manage to reduce this task to formulating and solving certain ODEs.
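The correspondence the reviewer summarizes can be illustrated numerically. The following is a minimal sketch, not the paper's construction: for a least-squares objective (smooth and strongly convex when X has full column rank), it runs gradient descent from the origin with a small step size to approximate the gradient flow up to time t, then compares the iterate against the ridge (l2-regularized) solution under the heuristic calibration λ = 1/t; the paper's exact scaling between t and the regularization parameter may differ.

```python
import numpy as np

# Least-squares objective f(w) = 0.5 * ||X w - y||^2, which is smooth and
# strongly convex in w when X has full column rank.
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
A, b = X.T @ X, X.T @ y

# Gradient descent from the origin with a small step size, so the iterates
# closely track the continuous-time gradient flow w'(t) = -grad f(w(t)).
L = np.linalg.eigvalsh(A).max()   # smoothness constant of f
eta, iters = 0.1 / L, 2000
w = np.zeros(d)
for _ in range(iters):
    w -= eta * (A @ w - b)

# Elapsed "time" along the flow, and the heuristic calibration lam = 1/t
# (an assumption for this sketch; the paper derives the precise scaling).
t = eta * iters
lam = 1.0 / t
w_ridge = np.linalg.solve(A + lam * np.eye(d), b)

rel = np.linalg.norm(w - w_ridge) / np.linalg.norm(w_ridge)
print(f"relative distance between GD iterate and ridge solution: {rel:.4f}")
```

In this well-conditioned example the two points agree to within a few percent, in the spirit of the pointwise closeness the theorem formalizes; the agreement degrades if the calibration between t and the regularization strength is chosen poorly.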


Connecting Optimization and Regularization Paths

Suggala, Arun; Prasad, Adarsh; Ravikumar, Pradeep K.

Neural Information Processing Systems

Papers published at the Neural Information Processing Systems Conference.