The Road Less Scheduled
Aaron Defazio 1 (Fundamental AI Research Team, Meta), Xingyu (Alice) Yang 2
Neural Information Processing Systems
Recently, Zamani and Glineur (2023) and Defazio et al. (2023) showed that the exact worst-case [...]

Our approach uses an alternative form of momentum that replaces traditional momentum. [...] From this viewpoint, the Schedule-Free updates can be seen as a version of momentum that has the same immediate effect, but with a greater delay for adding in the remainder of the gradient.
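The momentum interpretation above can be illustrated with a small sketch. The excerpt does not state the update equations, so the three-sequence form used here (an SGD sequence z, its running average x, and an interpolation point y where gradients are evaluated) is an assumption based on the commonly stated Schedule-Free SGD update. The function name, the toy quadratic objective, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
def schedule_free_sgd(grad, x0, lr=0.1, beta=0.9, steps=100):
    """Illustrative Schedule-Free SGD loop for a scalar parameter.

    Assumed update form (not given in the excerpt):
        y_t     = (1 - beta) * z_t + beta * x_t
        z_{t+1} = z_t - lr * grad(y_t)
        x_{t+1} = (1 - c) * x_t + c * z_{t+1},  with c = 1 / (t + 1)
    """
    z = x0  # base SGD sequence: each gradient enters here at full weight
    x = x0  # running average of z: the point that would be evaluated
    for t in range(1, steps + 1):
        # Gradients are evaluated at an interpolation between z and x.
        y = (1.0 - beta) * z + beta * x
        # Plain SGD step on z. The next y sees roughly a (1 - beta)
        # share of this step right away, through its z component.
        z = z - lr * grad(y)
        # Uniform averaging: the rest of the step reaches y only as x
        # slowly absorbs z, which is the "greater delay" in the text.
        c = 1.0 / (t + 1)
        x = (1.0 - c) * x + c * z
    return x


def quadratic_grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2 (hypothetical example)."""
    return 2.0 * (w - 3.0)


x_final = schedule_free_sgd(quadratic_grad, x0=0.0, lr=0.1, beta=0.9, steps=2000)
```

On this toy problem the averaged iterate `x_final` approaches the minimizer at 3 without any learning-rate schedule. The sketch is meant only to make the immediate and delayed parts of each gradient's contribution visible, not to reproduce the authors' implementation.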
Oct-9-2025, 18:59:32 GMT