On the Convergence of Step Decay Step-Size for Stochastic Optimization
Neural Information Processing Systems
Step decay step-size schedules (constant and then cut) are widely used in practice because of their excellent convergence and generalization qualities, but their theoretical properties are not yet well understood. We provide convergence results for step decay in the non-convex regime, ensuring that the gradient norm vanishes at an O(ln T/√T) rate.
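The "constant and then cut" schedule the abstract describes can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact algorithm; the initial step size `eta0`, decay factor `alpha`, and cut interval `cut_every` are assumed hyperparameters chosen for the example.

```python
def step_decay(eta0: float, alpha: float, cut_every: int, t: int) -> float:
    """Step decay schedule: hold the step size constant for cut_every
    iterations, then cut it by a factor of alpha, and repeat.

    eta0      -- initial step size (assumed hyperparameter)
    alpha     -- decay factor > 1 applied at each cut (assumed)
    cut_every -- number of iterations between cuts (assumed)
    t         -- current iteration index (0-based)
    """
    num_cuts = t // cut_every  # how many times the step size has been cut so far
    return eta0 / (alpha ** num_cuts)


# Example: start at 0.1, halve every 3 iterations.
schedule = [step_decay(0.1, 2.0, 3, t) for t in range(9)]
# → [0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.025, 0.025, 0.025]
```

In SGD, the iterate update at step t would then use this value, e.g. `x -= step_decay(eta0, alpha, cut_every, t) * grad`.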