Lower Bounds for Non-Convex Stochastic Optimization

Arjevani, Yossi, Carmon, Yair, Duchi, John C., Foster, Dylan J., Srebro, Nathan, Woodworth, Blake

Dec-4-2019–arXiv.org Machine Learning

We lower bound the complexity of finding $\epsilon$-stationary points (with gradient norm at most $\epsilon$) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least $\epsilon^{-4}$ queries to find an $\epsilon$ stationary point. The lower bound is tight, and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of $\epsilon^{-3}$ queries, establishing the optimality of recently proposed variance reduction techniques.

algorithm, construction, optimization, (15 more...)

arXiv.org Machine Learning

Dec-4-2019

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - New York (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
  - California > Santa Clara County
    - Palo Alto (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.82)

Industry:
- Education (0.45)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Machine Learning > Statistical Learning
    - Gradient Descent (0.75)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found