Reviews: SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
After response: I maintain my evaluation regarding the technical innovation and suboptimality of this paper. The basic Spider and SpiderBoost algorithms both target first-order stationary points; they are almost the same, and both give the $n^{1/2}$ rate. The simple way to modify either algorithm to escape saddle points is to add a Negative Curvature Search (NCS) subroutine (which can be done in a very modular way, and is already shown in the Spider paper). I'd say it is almost trivial to also show that SpiderBoost + NCS finds a second-order stationary point at the $n^{1/2}$ rate. Comparing this paper with SpiderBoost + NCS, there is no improvement from $n^{2/3}$ to $n^{1/2}$ (since SpiderBoost is already $n^{1/2}$) and no simplification of Spider (as SpiderBoost already did that). The only difference is replacing NCS with perturbations, which again requires some work, but most of the techniques are already there.
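To make the comparison in this review concrete: Spider, SpiderBoost, and SSRGD all rest on the same SARAH-style recursive gradient estimator, and differ mainly in step sizes and in how saddle points are handled (NCS versus perturbation). Below is a minimal sketch of one epoch of that shared estimator, assuming user-supplied oracles; the names `grad_full`, `grad_sample`, `epoch_len`, and `batch_size` are illustrative assumptions, not taken from either paper.

```python
import numpy as np

def recursive_estimator_epoch(x, step, grad_full, grad_sample, n,
                              epoch_len, batch_size, rng):
    """One epoch of a SARAH/SPIDER-style recursive gradient estimator
    (illustrative sketch, not the exact procedure of any one paper).
    grad_full(x): full gradient over all n components.
    grad_sample(x, idx): average gradient over the minibatch idx."""
    v = grad_full(x)                          # full gradient at epoch start
    for _ in range(epoch_len):
        x_prev, x = x, x - step * v           # plain gradient step on the estimate
        idx = rng.integers(n, size=batch_size)
        # recursive correction: reuse v from the previous iterate so the
        # estimator's variance stays small without a fresh full gradient
        v = v + grad_sample(x, idx) - grad_sample(x_prev, idx)
    return x
```

The key design point is the update `v += grad_sample(x) - grad_sample(x_prev)`: it tracks the true gradient at the cost of two minibatch evaluations per step, which is what yields the $\sqrt{n}/\epsilon^2$-type complexities discussed above.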
After extensive back-and-forth discussion among the reviewers, we ultimately felt that, despite the near-trivial alternatives obtained by combining Spider with negative curvature search, the approach of the present paper has its usefulness (as noted in the reviews), and that the paper can be accepted. The authors are, however, strongly encouraged to read the reviews carefully and implement the points mentioned in the rebuttal: it would be a pity if, after all the back-and-forth that this paper's review cycle has witnessed, the authors did not update the paper to clarify its contribution, its value, and its contrasts with less implementable methods.
SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
We analyze stochastic gradient algorithms for optimizing nonconvex problems. In particular, our goal is to find local minima (second-order stationary points) instead of just first-order stationary points, which may be bad unstable saddle points. We show that a simple perturbed version of the stochastic recursive gradient descent algorithm (called SSRGD) can find an $(\epsilon,\delta)$-second-order stationary point with $\widetilde{O}(\sqrt{n}/\epsilon^2 + \sqrt{n}/\delta^4 + n/\delta^3)$ stochastic gradient complexity for nonconvex finite-sum problems. As a by-product, SSRGD finds an $\epsilon$-first-order stationary point with $O(n + \sqrt{n}/\epsilon^2)$ stochastic gradients. These results are almost optimal, since Fang et al. [2018] provided a lower bound of $\Omega(\sqrt{n}/\epsilon^2)$ for finding even just an $\epsilon$-first-order stationary point.
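As a concrete illustration of the "simple perturbed" step the abstract refers to: when the current gradient is small (a possible saddle point), the iterate is displaced by a point drawn uniformly from a small ball, after which the recursive-gradient epochs resume. The sketch below shows only this perturbation step; the function name, the `radius` parameter, and the thresholding test are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def perturb_if_stuck(x, grad, eps, radius, rng):
    """If the gradient is small (a possible saddle point), displace x by a
    vector drawn uniformly from a ball of the given radius; otherwise
    return x unchanged. A hedged sketch, not the paper's exact rule."""
    if np.linalg.norm(grad) > eps:
        return x                                   # not stuck: do nothing
    u = rng.standard_normal(x.shape)               # random direction
    u /= np.linalg.norm(u)                         # project to unit sphere
    u *= radius * rng.random() ** (1.0 / x.size)   # uniform within the ball
    return x + u
```

The random displacement is what lets simple gradient steps escape strict saddle points with high probability, replacing the explicit Negative Curvature Search subroutine discussed in the reviews above.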