Stochastic Cubic Regularization for Fast Nonconvex Optimization

Tripuraneni, Nilesh, Stern, Mitchell, Jin, Chi, Regier, Jeffrey, Jordan, Michael I.

Dec-5-2017–arXiv.org Machine Learning

In this setting, we only have access to the stochastic function f(x; ξ), where the random variable ξ is sampled from an underlying distribution D. The task is to optimize the expected function f(x), which in general may be nonconvex. This framework covers a wide range of problems, including the offline setting where we minimize the empirical loss over a fixed amount of data, and the online setting where data arrives sequentially. One of the most prominent applications of stochastic optimization has been in large-scale statistics and machine learning problems, such as the optimization of deep neural networks. Classical analysis in nonconvex optimization only guarantees convergence to a first-order stationary point (i.e., a point x satisfying ‖ f(x)‖ 0), which can be a local minimum, a local maximum, or a saddle point. This paper goes further, proposing an algorithm that escapes saddle points and converges to a local minimum.

algorithm 1, gradient descent, stationary point, (10 more...)

arXiv.org Machine Learning

Dec-5-2017

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - California > Alameda County > Berkeley (0.04)
- Asia > Middle East
  - Jordan (0.04)
- Africa > Middle East
  - Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre:
- Research Report (0.82)

Industry:
- Education (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning
    - Statistical Learning (1.00)
    - Neural Networks > Deep Learning (0.48)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found