Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization
Rong Ge, Zhize Li, Weiyao Wang, Xiang Wang
Nonconvex optimization is widely used in machine learning. Recently, for problems like matrix sensing (Bhojanapalli et al., 2016), matrix completion (Ge et al., 2016), and certain objectives for neural networks (Ge et al., 2017b), it was shown that all local minima are also globally optimal, so simple local search algorithms can be used to solve these problems. For a convex function f(x), a point is a local and global minimum whenever its gradient is zero: ∇f(x) = 0. However, for nonconvex functions, a point with zero gradient can also be a saddle point. To avoid converging to saddle points, recent work (Ge et al., 2015; Jin et al., 2017a,b) proves stronger guarantees showing that local search algorithms converge to ɛ-approximate second-order stationary points - points with small gradients and almost positive semi-definite Hessians (see Definition 1). In theory, Xu et al. (2018) and Allen-Zhu and Li (2017) independently showed that finding a second-order stationary point is not much harder than finding a first-order stationary point - they give reduction algorithms, Neon/Neon2, that convert any algorithm finding first-order stationary points into one that converges to second-order stationary points.
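For concreteness, the following is a sketch of the definition of an ɛ-approximate second-order stationary point commonly used in this line of work (e.g., Jin et al., 2017a); the paper's own Definition 1 may use a slightly different parametrization, and the Hessian-Lipschitz constant ρ below is an assumption of this sketch.

% Sketch of the standard definition; assumes f has a ρ-Lipschitz Hessian
% and that the amsthm "definition" environment is available.
\begin{definition}[$\epsilon$-approximate second-order stationary point]
For a function $f$ with $\rho$-Lipschitz Hessian, a point $x$ is an
$\epsilon$-approximate second-order stationary point if
\[
  \|\nabla f(x)\| \le \epsilon
  \qquad \text{and} \qquad
  \lambda_{\min}\!\bigl(\nabla^2 f(x)\bigr) \ge -\sqrt{\rho\,\epsilon},
\]
i.e., the gradient is small and the Hessian is almost positive semi-definite.
\end{definition}

Under this definition, a first-order stationary point only satisfies the gradient condition, so a saddle point with a strongly negative Hessian eigenvalue is excluded from the second-order notion.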
May 1, 2019