High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes

Jagannath, Aukosh, Jones-McCormick, Taj, Sarangian, Varnan

Nov-7-2025–arXiv.org Machine Learning

We develop a high-dimensional scaling limit for Stochastic Gradient Descent with Polyak Momentum (SGD-M) and adaptive step-sizes. This provides a framework to rigourously compare online SGD with some of its popular variants. We show that the scaling limits of SGD-M coincide with those of online SGD after an appropriate time rescaling and a specific choice of step-size. However, if the step-size is kept the same between the two algorithms, SGD-M will amplify high-dimensional effects, potentially degrading performance relative to online SGD. We demonstrate our framework on two popular learning problems: Spiked Tensor PCA and Single Index Models. In both cases, we also examine online SGD with an adaptive step-size based on normalized gradients. In the high-dimensional regime, this algorithm yields multiple benefits: its dynamics admit fixed points closer to the population minimum and widens the range of admissible step-sizes for which the iterates converge to such solutions. These examples provide a rigorous account, aligning with empirical motivation, of how early preconditioners can stabilize and improve dynamics in settings where online SGD fails.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

Nov-7-2025

arXiv.org PDF

Add feedback

Country:
- North America > Canada
  - Ontario > Waterloo Region > Waterloo (0.04)
- Europe
  - Russia (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Czechia > South Moravian Region
    - Brno (0.04)
- Asia
  - Russia (0.04)
  - Middle East > Jordan (0.04)
- Africa > Middle East
  - Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre:
- Research Report (0.50)

Industry:
- Education (0.48)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Gradient Descent (0.54)
  - Neural Networks > Deep Learning (0.45)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found