Neural Information Processing Systems
In OrMo, momentum is incorporated into ASGD by organizing the gradients in order based on their iteration indexes. We theoretically prove the convergence of OrMo with both constant and delay-adaptive learning rates for non-convex problems.
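To make the idea of ordering gradients by iteration index concrete, here is a toy sketch. It is not the OrMo algorithm from the paper: the function names, the scalar gradients, and the weighting rule `beta ** (num_iters - 1 - k)` are assumptions chosen for illustration only. It shows that when each delayed gradient is weighted by its iteration index rather than its arrival time, the momentum buffer matches the one in-order momentum would produce.

```python
# Toy illustration only; NOT the OrMo algorithm from the paper.
# All names and the weighting rule below are assumptions made for this sketch.

def in_order_momentum(grads, beta):
    """Classic momentum buffer m = beta * m + g, applied in list order."""
    m = 0.0
    for g in grads:
        m = beta * m + g
    return m


def index_weighted_momentum(arrivals, num_iters, beta):
    """Accumulate out-of-order gradients, weighting each by its iteration index.

    arrivals: list of (iteration_index, gradient) pairs in arrival order.
    A gradient with index k gets weight beta ** (num_iters - 1 - k), the
    weight it would carry at the end if all gradients had arrived in order.
    """
    m = 0.0
    for k, g in arrivals:
        m += beta ** (num_iters - 1 - k) * g
    return m


grads = [1.0, -2.0, 0.5, 3.0]                         # gradients indexed 0..3
arrivals = [(2, 0.5), (0, 1.0), (3, 3.0), (1, -2.0)]  # delayed, out of order

ordered = in_order_momentum(grads, beta=0.9)
reordered = index_weighted_momentum(arrivals, num_iters=len(grads), beta=0.9)
# Naive baseline: apply momentum in arrival order, ignoring iteration indexes.
naive = in_order_momentum([g for _, g in arrivals], beta=0.9)
```

In this toy setting `reordered` agrees with `ordered` up to floating-point error, while `naive` does not, which is the intuition behind using iteration indexes instead of arrival order.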
Feb-19-2026, 03:53:15 GMT