Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent

Marina Sheshukova, Sergey Samsonov, Denis Belomestny, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov

arXiv.org Machine Learning 

In this paper, we establish non-asymptotic convergence rates in the central limit theorem for Polyak-Ruppert-averaged iterates of stochastic gradient descent (SGD). Our analysis builds on the Gaussian approximation result for nonlinear statistics of independent random variables due to Shao and Zhang (2022). Using this result, we prove the non-asymptotic validity of the multiplier bootstrap for constructing confidence sets for the optimal solution of an optimization problem. In particular, our approach avoids the need to approximate the limiting covariance of Polyak-Ruppert SGD iterates, which allows us to derive approximation rates in convex distance of order up to $1/\sqrt{n}$.
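To make the setting concrete, the sketch below illustrates Polyak-Ruppert-averaged SGD and a multiplier bootstrap confidence set on a toy online linear regression problem. It is a minimal illustration, not the paper's exact procedure: the choice of exponential multipliers, the step size $\gamma_k = k^{-0.6}$, the Euclidean-ball confidence set, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, B = 5, 5000, 200          # dimension, sample size, bootstrap replications (illustrative)
theta_star = rng.normal(size=d)  # unknown optimum of E[(y - x^T theta)^2 / 2]

# One fixed data stream (x_k, y_k), k = 1..n.
X = rng.normal(size=(n, d))
Y = X @ theta_star + rng.normal(size=n)

def averaged_sgd(weights):
    """Run SGD over the stream with per-step gradient multipliers and
    return the Polyak-Ruppert average of the iterates."""
    theta = np.zeros(d)
    avg = np.zeros(d)
    for k in range(n):
        g = weights[k] * (X[k] @ theta - Y[k]) * X[k]  # (re)weighted stochastic gradient
        step = 1.0 / (k + 1) ** 0.6                    # step size gamma_k = k^{-0.6} (assumed)
        theta = theta - step * g
        avg += (theta - avg) / (k + 1)                 # running average of iterates
    return avg

# Point estimate: plain averaged SGD (all multipliers equal to 1).
theta_bar = averaged_sgd(np.ones(n))

# Multiplier bootstrap: rerun on the same data with i.i.d. mean-one positive
# multipliers, then calibrate a confidence ball from the bootstrap spread
# around theta_bar, without estimating the limiting covariance.
boot = np.array([averaged_sgd(rng.exponential(size=n)) for _ in range(B)])
radius = np.quantile(np.linalg.norm(boot - theta_bar, axis=1), 0.95)
print("95% bootstrap confidence ball radius:", radius)
print("covers theta_star:", np.linalg.norm(theta_bar - theta_star) <= radius)
```

In this sketch the bootstrap replications reuse the observed stream and only randomize the gradient weights, which mirrors the idea of calibrating the fluctuations of the averaged iterates directly rather than plugging in a covariance estimate.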