Gradient Descent Algorithm Survey

Fucheng Deng, Wanjie Wang, Ao Gong, Xiaoqi Wang, Fan Wang

arXiv.org Artificial Intelligence 

Its simple update rule, linear scalability with sample size, and compatibility with momentum, mini-batching, and learning-rate heuristics keep it dominant in both industry and academia. Current research continues to refine convergence rates, variance characterizations, and averaging schemes, while engineering efforts focus on hardware-aligned and distributed variants.

B. Mini-Batch Stochastic Gradient Descent

1) Background and Development: Batch Gradient Descent (BGD) computes the gradient over the entire training set at each iteration. As datasets grow to millions of samples or more, the cost of a single iteration becomes prohibitive, making BGD unsuitable for large-scale learning. The convergence of SGD was established by Robbins and Monro via the stochastic approximation method [1]. SGD updates with a single sample per step, giving low per-iteration cost but high gradient variance and unstable updates. The mini-batch strategy has gradually become the mainstream choice in practice, especially with the rise of large-scale machine learning and deep learning. Bottou emphasized the practical value of mini-batches in his work on large-scale learning [5], and systematic monographs and reviews on deep learning have further standardized the approach [6], [7]. Mini-batch SGD strikes a practical balance among update stability, update frequency, and GPU parallel acceleration [2].
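The mini-batch scheme described above can be sketched as follows. This is a minimal illustrative implementation on synthetic least-squares data, not a reference implementation; the learning rate, batch size, and epoch count are assumptions chosen for the toy problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X @ w_true + small noise
n, d = 1000, 3
w_true = np.array([2.0, -1.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)

w = np.zeros(d)
lr, batch_size, epochs = 0.05, 32, 20  # illustrative hyperparameters

for _ in range(epochs):
    perm = rng.permutation(n)              # reshuffle once per epoch
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of the mean squared error over this mini-batch only
        grad = (2.0 / len(idx)) * Xb.T @ (Xb @ w - yb)
        w -= lr * grad                     # SGD update on the mini-batch
```

Averaging the gradient over a batch of 32 samples reduces its variance relative to single-sample SGD, while still performing many more updates per epoch than full-batch gradient descent.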
