Statistical Guarantees for High-Dimensional Stochastic Gradient Descent
Jiaqi Li, Zhipeng Lou, Johannes Schmidt-Hieber, Wei Biao Wu
Stochastic Gradient Descent (SGD) and its Ruppert-Polyak averaged variant (ASGD) lie at the heart of modern large-scale learning, yet their theoretical properties in high-dimensional settings remain poorly understood. In this paper, we provide rigorous statistical guarantees for constant learning-rate SGD and ASGD in high-dimensional regimes. Our key innovation is to transfer powerful tools from high-dimensional time series to online learning. Specifically, by viewing SGD as a nonlinear autoregressive process and adapting existing coupling techniques, we prove the geometric-moment contraction of high-dimensional SGD for constant learning rates, thereby establishing asymptotic stationarity of the iterates. Building on this, we derive the $q$-th moment convergence of SGD and ASGD for any $q\ge2$ in general $\ell^s$-norms and, in particular, in the $\ell^{\infty}$-norm that is frequently adopted in high-dimensional sparse or structured models. Furthermore, we provide a sharp high-probability concentration analysis, which yields probabilistic bounds for high-dimensional ASGD. Beyond closing a critical gap in SGD theory, our proposed framework offers a novel toolkit for analyzing a broad class of high-dimensional learning algorithms.
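As a rough illustration of the autoregressive view and the coupling idea, the sketch below runs constant learning-rate SGD on a streaming least-squares problem, writing each update as $\beta_k = \beta_{k-1} - \alpha \nabla g(\beta_{k-1}, \xi_k)$. This notation, the Gaussian design, the sparse target `beta_star`, the function name `run_coupled_sgd`, and all parameter values are assumptions made for the example and are not taken from the paper. Two chains start from different points but are driven by the same samples, so the gap between them shows the contraction empirically. A Ruppert-Polyak average taken after a burn-in period is compared with the last iterate in the $\ell^{\infty}$-norm.

```python
import numpy as np


def run_coupled_sgd(dim=50, n_steps=5000, burn_in=1000, lr=0.01,
                    noise_sd=1.0, seed=0):
    """Constant-step SGD on least squares with two coupled chains.

    All names and defaults are illustrative assumptions, not the
    paper's setup.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical sparse target: five nonzero coordinates.
    beta_star = np.zeros(dim)
    beta_star[:5] = 1.0

    # Two chains with different initial values.
    beta_a = np.zeros(dim)
    beta_b = 5.0 * rng.normal(size=dim)

    avg = np.zeros(dim)          # running Ruppert-Polyak average of chain A
    gaps = np.empty(n_steps)     # sup-norm distance between the two chains

    for k in range(n_steps):
        # One fresh sample (x, y), shared by both chains (the coupling).
        x = rng.normal(size=dim)
        y = x @ beta_star + noise_sd * rng.normal()

        # Gradient of 0.5 * (y - x @ beta)**2 is (x @ beta - y) * x.
        beta_a = beta_a - lr * (x @ beta_a - y) * x
        beta_b = beta_b - lr * (x @ beta_b - y) * x

        gaps[k] = np.max(np.abs(beta_a - beta_b))

        # Average the iterates of chain A once the burn-in has passed.
        if k >= burn_in:
            avg += (beta_a - avg) / (k - burn_in + 1)

    return {
        "gaps": gaps,
        "sgd_err": np.max(np.abs(beta_a - beta_star)),   # last iterate
        "asgd_err": np.max(np.abs(avg - beta_star)),     # averaged iterate
    }


if __name__ == "__main__":
    out = run_coupled_sgd()
    print("gap after first step:", out["gaps"][0])
    print("gap after last step: ", out["gaps"][-1])
    print("sup-norm error, SGD: ", out["sgd_err"])
    print("sup-norm error, ASGD:", out["asgd_err"])
```

In this toy setting the gap between the coupled chains shrinks toward machine precision, while the last SGD iterate keeps fluctuating around the target at a scale set by the learning rate; averaging reduces that fluctuation. The simulation only illustrates the phenomenon and does not verify any of the stated bounds.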
Oct-15-2025
- Country:
- North America
- United States
- Illinois > Cook County
- Chicago (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- California > San Diego County
- Canada > Quebec
- Montreal (0.04)
- Europe
- Netherlands (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Asia
- Middle East > Jordan (0.04)
- Singapore (0.04)
- Genre:
- Research Report > Experimental Study (1.00)
- Industry:
- Education > Educational Setting > Online (0.34)