Robust Gradient Descent via Heavy-Ball Momentum with Predictive Extrapolation
Accelerated gradient methods like Nesterov's Accelerated Gradient (NAG) achieve faster convergence on well-conditioned problems but often diverge on ill-conditioned or non-convex landscapes due to aggressive momentum accumulation. We propose Heavy-Ball Synthetic Gradient Extrapolation (HB-SGE), a robust first-order method that combines heavy-ball momentum with predictive gradient extrapolation. Unlike classical momentum methods that accumulate historical gradients, HB-SGE estimates future gradient directions using local Taylor approximations, providing adaptive acceleration while maintaining stability. We prove convergence guarantees for strongly convex functions and demonstrate empirically that HB-SGE prevents divergence on problems where NAG and standard momentum fail. On ill-conditioned quadratics (condition number κ = 50), HB-SGE converges in 119 iterations while both SGD and NAG diverge. On the non-convex Rosenbrock function, HB-SGE converges in 2,718 iterations where classical momentum methods diverge within 10 steps. While NAG remains faster on well-conditioned problems, HB-SGE provides a robust alternative with consistent speedups over SGD across diverse landscapes, requiring only O(d) memory overhead and the same hyperparameters as standard momentum.
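The abstract does not spell out the HB-SGE update rule, but the description (heavy-ball momentum applied to a Taylor-based prediction of the next gradient) suggests a sketch along the following lines. This is an illustrative reading, not the paper's method: the finite-difference extrapolation g_pred ≈ 2·g_t − g_{t−1} stands in for the "local Taylor approximation," and the function name hb_sge, the Rosenbrock test, and all hyperparameter defaults are assumptions.

```python
import numpy as np

def hb_sge(grad, x0, lr=1e-4, beta=0.9, max_iter=50_000, tol=1e-8):
    """Hypothetical sketch of heavy-ball momentum with predictive
    gradient extrapolation. The exact HB-SGE update is not given in
    the abstract; here the 'future' gradient is estimated by a
    first-order finite-difference Taylor extrapolation along the
    trajectory: g_pred = g_t + (g_t - g_{t-1})."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)      # heavy-ball velocity (O(d) extra memory)
    g_prev = grad(x)          # previous gradient (O(d) extra memory)
    for t in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        # Predicted next gradient via linear (Taylor) extrapolation.
        g_pred = 2.0 * g - g_prev
        # Heavy-ball update driven by the predicted gradient.
        v = beta * v - lr * g_pred
        x = x + v
        g_prev = g
    return x, t

def rosenbrock_grad(p):
    """Gradient of f(x, y) = (1 - x)^2 + 100 (y - x^2)^2."""
    x, y = p
    return np.array([-2.0 * (1.0 - x) - 400.0 * x * (y - x**2),
                     200.0 * (y - x**2)])

x_star, iters = hb_sge(rosenbrock_grad, x0=[-1.2, 1.0])
print(x_star, iters)   # should approach the minimizer (1, 1)
```

Under this reading, the only state beyond standard heavy-ball momentum is the cached previous gradient, consistent with the stated O(d) memory overhead, and the tunable knobs remain the learning rate and momentum coefficient β, matching the claim that HB-SGE uses the same hyperparameters as standard momentum.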
arXiv.org Artificial Intelligence
Dec-12-2025