SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

Neural Information Processing Systems 

We analyze stochastic gradient algorithms for optimizing nonconvex problems. In particular, our goal is to find local minima (second-order stationary points) instead of just first-order stationary points, which may be bad, unstable saddle points. We show that a simple perturbed version of the stochastic recursive gradient descent algorithm (called SSRGD) can find an (\epsilon,\delta)-second-order stationary point with \widetilde{O}(\sqrt{n}/\epsilon^2 + \sqrt{n}/\delta^4 + n/\delta^3) stochastic gradient complexity for nonconvex finite-sum problems. As a by-product, SSRGD finds an \epsilon-first-order stationary point with O(n + \sqrt{n}/\epsilon^2) stochastic gradients. These results are almost optimal, since Fang et al. [2018] provided a lower bound of \Omega(\sqrt{n}/\epsilon^2) for finding even just an \epsilon-first-order stationary point.
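To make the algorithmic idea concrete, below is a minimal sketch of a perturbed stochastic recursive gradient method in the spirit of SSRGD, not the paper's exact procedure: a SARAH-style recursive gradient estimator with periodic full-gradient restarts, plus a uniform ball perturbation whenever the gradient is small (to escape a possible saddle point). The function names (`ssrgd`, `grad_i`), step size, minibatch size, and thresholds are illustrative assumptions; the paper's tuned parameter choices and stopping rules are omitted.

```python
import numpy as np

def ssrgd(grad_i, n, x0, eta=0.01, b=None, m=None, g_thresh=1e-3,
          r=1e-3, max_epochs=50, rng=None):
    """Hedged sketch of perturbed stochastic recursive gradient descent.

    grad_i(x, idx) should return the average gradient of the component
    functions f_i, i in idx, at the point x. All constants here are
    illustrative, not the paper's tuned values.
    """
    rng = np.random.default_rng() if rng is None else rng
    b = int(np.sqrt(n)) if b is None else b   # minibatch size ~ sqrt(n)
    m = int(np.sqrt(n)) if m is None else m   # epoch length ~ sqrt(n)
    x = x0.copy()
    for _ in range(max_epochs):
        v = grad_i(x, np.arange(n))           # full gradient at epoch start
        if np.linalg.norm(v) <= g_thresh:
            # Small gradient: perturb x uniformly in a ball of radius r
            # so the iterate can escape a saddle point.
            xi = rng.normal(size=x.shape)
            xi *= r * rng.random() ** (1 / x.size) / np.linalg.norm(xi)
            x = x + xi
            v = grad_i(x, np.arange(n))
        for _ in range(m):
            x_prev = x
            x = x - eta * v                   # descend along the estimator
            idx = rng.integers(0, n, size=b)  # sample a minibatch
            # Recursive (SARAH-style) gradient estimator:
            # v_t = (1/b) sum_i [grad f_i(x_t) - grad f_i(x_{t-1})] + v_{t-1}
            v = grad_i(x, idx) - grad_i(x_prev, idx) + v
    return x

# Smoke test on a hypothetical toy least-squares problem (convex, so no
# saddles; this only checks that the loop runs and decreases the gradient).
A, y = np.random.randn(100, 5), np.random.randn(100)
g = lambda x, idx: A[idx].T @ (A[idx] @ x - y[idx]) / len(idx)
x_out = ssrgd(g, n=100, x0=np.zeros(5))
```

The \widetilde{O}(\sqrt{n}/\epsilon^2 + ...) complexity comes from the sqrt(n)-sized minibatches and epochs: each epoch costs n gradients for the restart plus roughly \sqrt{n} \cdot \sqrt{n} = n for the inner loop, while the recursive estimator keeps the variance small enough to use a constant step size.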