Understanding the Role of Momentum in Stochastic Gradient Methods
Igor Gitman, Hunter Lang, Pengchuan Zhang, Lin Xiao
Neural Information Processing Systems
The use of momentum in stochastic gradient methods has become a widespread practice in machine learning. Different variants of momentum, including heavy-ball momentum, Nesterov's accelerated gradient (NAG), and quasi-hyperbolic momentum (QHM), have demonstrated success on various tasks.
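To make the relationship between these variants concrete, here is a minimal sketch of one QHM update in its commonly stated normalized form. The function names `qhm_step` and `minimize_quadratic`, the toy quadratic objective, and the hyperparameter values are illustrative assumptions, not taken from the paper. Under this parameterization, `nu = 0` should reduce to plain SGD, `nu = 1` to (normalized) heavy-ball momentum, and `nu = beta` to a normalized form of NAG.

```python
def qhm_step(x, d, grad, alpha, beta, nu):
    """One QHM update on lists of floats; returns (new_x, new_d).

    alpha: step size, beta: momentum averaging factor, nu: mixing weight.
    These names are illustrative, not the paper's notation.
    """
    # Momentum buffer: exponential moving average of the gradients.
    new_d = [beta * di + (1.0 - beta) * gi for di, gi in zip(d, grad)]
    # Step along a nu-weighted mix of the raw gradient and the updated buffer.
    new_x = [xi - alpha * ((1.0 - nu) * gi + nu * di)
             for xi, gi, di in zip(x, grad, new_d)]
    return new_x, new_d


def minimize_quadratic(nu, beta=0.9, alpha=0.1, steps=300):
    """Run QHM on the toy objective f(x) = 0.5 * sum(x_i^2) with exact gradients."""
    x, d = [1.0, -2.0], [0.0, 0.0]
    for _ in range(steps):
        grad = list(x)  # the gradient of 0.5 * ||x||^2 is x itself
        x, d = qhm_step(x, d, grad, alpha, beta, nu)
    return x


if __name__ == "__main__":
    # Assumed settings: nu=0 (SGD), nu=1 (heavy-ball), nu=beta (NAG-like).
    for nu in (0.0, 1.0, 0.9):
        print(nu, minimize_quadratic(nu))
```

The toy problem uses exact gradients, so it only illustrates the update rule itself; in the stochastic setting `grad` would be a noisy minibatch estimate.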
Oct-2-2025, 17:02:50 GMT