Reviews: A Universally Optimal Multistage Accelerated Stochastic Gradient Method
Neural Information Processing Systems
This paper designs a multistage accelerated SGD algorithm that does not need to know the noise level or the initial optimality gap at initialization, yet still obtains optimal convergence rates. The paper is well written and the results are strong.
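To illustrate the general idea behind multistage stochastic gradient methods (not the paper's specific algorithm), here is a minimal sketch: SGD is run in stages, with the step size halved and the stage length doubled at each boundary, so later stages average out gradient noise more aggressively. All function names, constants, and the toy quadratic objective below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def multistage_sgd(grad, x0, stages=4, iters_per_stage=200, lr0=0.5, rng=None):
    """Generic multistage SGD sketch (illustrative, not the paper's method):
    run plain SGD in stages, halving the step size and doubling the stage
    length at each stage boundary."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.array(x0, dtype=float)
    lr, T = lr0, iters_per_stage
    for _ in range(stages):
        for _ in range(T):
            # Noisy gradient oracle: true gradient plus Gaussian noise.
            g = grad(x) + rng.normal(scale=0.1, size=x.shape)
            x = x - lr * g
        lr *= 0.5   # smaller steps in later stages suppress noise
        T *= 2      # longer stages let the iterates settle at the new step size
    return x

# Toy strongly convex objective f(x) = 0.5 * ||x||^2, with gradient x.
x_final = multistage_sgd(lambda x: x, x0=[5.0, -3.0])
print(np.linalg.norm(x_final))
```

The halve-step/double-length schedule is a common heuristic in multistage schemes; the contribution reviewed here is achieving this kind of behavior without knowing the noise level or optimality gap in advance.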
Jan-27-2025, 09:15:53 GMT