Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent

Marina Sheshukova, Sergey Samsonov, Denis Belomestny, Eric Moulines, Qi-Man Shao, Zhuo-Song Zhang, Alexey Naumov

arXiv.org Machine Learning 

In this paper, we establish non-asymptotic convergence rates in the central limit theorem for Polyak-Ruppert-averaged iterates of stochastic gradient descent (SGD). Our analysis builds on the Gaussian approximation result for nonlinear statistics of independent random variables due to Shao and Zhang (2022). Using this result, we prove the non-asymptotic validity of the multiplier bootstrap for constructing confidence sets for the optimal solution of an optimization problem. In particular, our approach avoids the need to approximate the limiting covariance of Polyak-Ruppert SGD iterates, which allows us to derive approximation rates in convex distance of order up to $1/\sqrt{n}$.
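To make the setting concrete, the sketch below illustrates Polyak-Ruppert-averaged SGD and a multiplier bootstrap confidence set on a toy online linear regression problem. It is a minimal illustration, not the paper's exact procedure: the choice of exponential multipliers, the step size $\gamma_k = k^{-0.6}$, the Euclidean-ball confidence set, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, B = 5, 5000, 200          # dimension, sample size, bootstrap replications (illustrative)
theta_star = rng.normal(size=d)  # unknown optimum of E[(y - x^T theta)^2 / 2]

# One fixed data stream (x_k, y_k), k = 1..n.
X = rng.normal(size=(n, d))
Y = X @ theta_star + rng.normal(size=n)

def averaged_sgd(weights):
    """Run SGD over the stream with per-step gradient multipliers and
    return the Polyak-Ruppert average of the iterates."""
    theta = np.zeros(d)
    avg = np.zeros(d)
    for k in range(n):
        g = weights[k] * (X[k] @ theta - Y[k]) * X[k]  # (re)weighted stochastic gradient
        step = 1.0 / (k + 1) ** 0.6                    # step size gamma_k = k^{-0.6} (assumed)
        theta = theta - step * g
        avg += (theta - avg) / (k + 1)                 # running average of iterates
    return avg

# Point estimate: plain averaged SGD (all multipliers equal to 1).
theta_bar = averaged_sgd(np.ones(n))

# Multiplier bootstrap: rerun on the same data with i.i.d. mean-one positive
# multipliers, then calibrate a confidence ball from the bootstrap spread
# around theta_bar, without estimating the limiting covariance.
boot = np.array([averaged_sgd(rng.exponential(size=n)) for _ in range(B)])
radius = np.quantile(np.linalg.norm(boot - theta_bar, axis=1), 0.95)
print("95% bootstrap confidence ball radius:", radius)
print("covers theta_star:", np.linalg.norm(theta_bar - theta_star) <= radius)
```

In this sketch the bootstrap replications reuse the observed stream and only randomize the gradient weights, which mirrors the idea of calibrating the fluctuations of the averaged iterates directly rather than plugging in a covariance estimate.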