Goto

Collaborating Authors

 Asia



1e5cff01121223de917a84a242de30a5-Paper-Conference.pdf

Neural Information Processing Systems

InOrMo, momentum isincorporated into ASGD byorganizing the gradients in order based on their iteration indexes. We theoretically prove the convergence of OrMo with both constant and delay-adaptive learning rates for non-convexproblems.



1d8dc55c1f6cf124af840ce1d92d1896-Paper-Conference.pdf

Neural Information Processing Systems

As inthe classical problem, weights are fixed by an adversary and elements appear in random order. In contrast to previous variants of predictions, our algorithm only has access toamuch weakerpiece ofinformation: anadditive gapc.