Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization

Tianyi Liu, Yan Li, Song Wei, Enlu Zhou, Tuo Zhao

arXiv.org Machine Learning 

Nonconvex optimization has been widely adopted in various domains, including image recognition (Hinton et al., 2012; Krizhevsky et al., 2012), Bayesian graphical models (Jordan et al., 2004; Attias, 2000), recommendation systems (Salakhutdinov et al., 2007), etc. Although solving a nonconvex problem is difficult in general, empirical evidence has shown that simple first-order algorithms such as stochastic gradient descent (SGD) are able to solve many of the aforementioned nonconvex problems efficiently. The theory behind these empirical observations, however, is still largely unexplored. The classical optimization literature contains fruitful results characterizing the convergence of SGD to first-order stationary points of nonconvex problems. However, such results fall short of explaining the empirical observation that SGD often converges to global minima for a wide class of nonconvex problems used in practice. More recently, understanding the role of noise in the algorithmic behavior of SGD has received significant attention. For instance, Jin et al. (2017) show that a perturbed form of gradient descent is able to escape from strict saddle points and converge to second-order stationary points (i.e., local minima). Zhou et al. (2019) further show that noise in the update can help SGD escape from spurious local minima and converge to global minima.
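To make the discussion of noise injection concrete, the following is a minimal sketch of noisy gradient descent on a simple symmetric matrix factorization objective, min over X of (1/4)||XX^T - M||_F^2. The objective, the Gaussian perturbation of the gradient, and all step-size and noise-scale choices here are illustrative assumptions, not the exact algorithm or setting analyzed in the paper.

```python
import numpy as np

def noisy_gradient_descent(M, r, eta=0.01, sigma=0.01, n_iters=2000, seed=0):
    """Gradient descent with injected Gaussian noise on
    f(X) = ||X X^T - M||_F^2 / 4 (symmetric matrix factorization).
    eta (step size) and sigma (noise scale) are illustrative choices."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    X = 0.1 * rng.standard_normal((n, r))   # small random initialization
    for _ in range(n_iters):
        grad = (X @ X.T - M) @ X            # gradient of f at X (M symmetric)
        noise = sigma * rng.standard_normal(X.shape)
        X = X - eta * (grad + noise)        # noisy gradient step
    return X

# Example: recover a rank-2 factor of a random PSD matrix.
rng = np.random.default_rng(1)
U = rng.standard_normal((20, 2))
M = U @ U.T
X_hat = noisy_gradient_descent(M, r=2)
print(np.linalg.norm(X_hat @ X_hat.T - M) / np.linalg.norm(M))
```

The injected noise plays the role of the stochasticity in SGD updates discussed above; in this sketch it is simply added to the exact gradient at each step.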
