gradientdescent
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
- Asia > Middle East > Jordan (0.05)
- North America > Canada (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Asia > Middle East > Jordan (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- (4 more...)
- Banking & Finance (0.84)
- Media > Music (0.40)
Stochastic Variance Reduction for Nonconvex Optimization
Reddi, Sashank J., Hefny, Ahmed, Sra, Suvrit, Poczos, Barnabas, Smola, Alex
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary points) of SVRG for nonconvex optimization, and show that it is provably faster than SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. We extend our analysis to mini-batch variants of SVRG, showing (theoretical) linear speedup due to mini-batching in parallel settings.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.75)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)