Full-Batch Gradient Descent Outperforms One-Pass SGD: Sample Complexity Separation in Single-Index Learning

Kovačević, Filip, Ji, Hong Chang, Wu, Denny, Soltanolkotabi, Mahdi, Mondelli, Marco

Feb-3-2026–arXiv.org Machine Learning

It is folklore that reusing training data more than once can improve the statistical efficiency of gradient-based learning. However, beyond linear regression, the theoretical advantage of full-batch gradient descent (GD, which always reuses all the data) over one-pass stochastic gradient descent (online SGD, which uses each data point only once) remains unclear. In this work, we consider learning a $d$-dimensional single-index model with a quadratic activation, for which it is known that one-pass SGD requires $n\gtrsim d\log d$ samples to achieve weak recovery. We first show that this $\log d$ factor in the sample complexity persists for full-batch spherical GD on the correlation loss; however, by simply truncating the activation, full-batch GD exhibits a favorable optimization landscape at $n \simeq d$ samples, thereby outperforming one-pass SGD (with the same activation) in statistical efficiency. We complement this result with a trajectory analysis of full-batch GD on the squared loss from small initialization, showing that $n \gtrsim d$ samples and $T \gtrsim\log d$ gradient steps suffice to achieve strong (exact) recovery.

artificial intelligence, machine learning, probability, (15 more...)

arXiv.org Machine Learning

Feb-3-2026

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - California (0.14)
  - New York > New York County
    - New York City (0.04)
- Europe
  - Austria (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - Jordan (0.04)
- Africa > Middle East
  - Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre:
- Research Report (0.81)

Industry:
- Government > Regional Government (0.45)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found