Stochastic Cubic Regularization for Fast Nonconvex Optimization

Tripuraneni, Nilesh, Stern, Mitchell, Jin, Chi, Regier, Jeffrey, Jordan, Michael I.

arXiv.org Machine Learning 

In this setting, we only have access to the stochastic function f(x; ξ), where the random variable ξ is sampled from an underlying distribution D. The task is to optimize the expected function f(x), which in general may be nonconvex. This framework covers a wide range of problems, including the offline setting where we minimize the empirical loss over a fixed amount of data, and the online setting where data arrives sequentially. One of the most prominent applications of stochastic optimization has been in large-scale statistics and machine learning problems, such as the optimization of deep neural networks. Classical analysis in nonconvex optimization only guarantees convergence to a first-order stationary point (i.e., a point x satisfying ‖ f(x)‖ 0), which can be a local minimum, a local maximum, or a saddle point. This paper goes further, proposing an algorithm that escapes saddle points and converges to a local minimum.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found