STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization
Neural Information Processing Systems
The most popular approach to handling such problems is variance reduction, a family of techniques known to obtain tight convergence rates that match the lower bounds in this setting.