Reviews: Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

Oct-8-2024, 10:28:12 GMT–Neural Information Processing Systems

The fundamental claim [line 101 & 239] is that asymptotically, for streaming PCA, the delay tau is allowed to scale as (1 - mu)^2 / sqrt(eta), where eta is the step size and mu the momentum parameter.

Major Comments

Before we discuss the proof, I think the introduction is somewhat misleading. In line 76, the authors point out that previous works all focus on analyzing convergence to a first-order optimal solution. Readers may be misled into thinking that this paper improves on the results of those works. However, the problems studied in those papers and streaming PCA are different.
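To make the scaling in the fundamental claim concrete, here is a rough numerical sketch of how the tolerated delay shrinks as the momentum grows. It assumes eta denotes the step size and mu the momentum parameter, and it drops all constants and lower-order factors, so the values are only meaningful relative to each other. The function name `delay_scale` is hypothetical and does not come from the paper.

```python
import math


def delay_scale(mu: float, eta: float) -> float:
    """Return (1 - mu)^2 / sqrt(eta), the claimed scaling of the delay tau.

    Assumption: mu is the momentum parameter in [0, 1) and eta > 0 is the
    step size. Constants are omitted, so this is a relative scale only,
    not an actual bound on the delay.
    """
    if not 0.0 <= mu < 1.0:
        raise ValueError("mu must lie in [0, 1)")
    if eta <= 0.0:
        raise ValueError("eta must be positive")
    return (1.0 - mu) ** 2 / math.sqrt(eta)


if __name__ == "__main__":
    eta = 0.01  # illustrative step size, not taken from the paper
    for mu in (0.0, 0.5, 0.9):
        print(f"mu={mu:.1f}  delay scale ~ {delay_scale(mu, eta):.3f}")
```

Under this scaling, moving from mu = 0.5 to mu = 0.9 at a fixed step size reduces the tolerated delay by a factor of 25, which is the tradeoff between momentum and asynchrony that the claim describes.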

acceleration tradeoff, momentum and asynchrony, nonconvex stochastic optimization

Genre:
- Summary/Review (0.37)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)