Momentum Centering and Asynchronous Update for Adaptive Gradient Methods

Neural Information Processing Systems 

We propose ACProp (Asynchronous-centering-Prop), an adaptive optimizer that combines centering of the second momentum with an asynchronous update (i.e., for the t-th update, the denominator uses information up to step t-1, while the numerator uses the gradient at step t). ACProp has both strong theoretical properties and strong empirical performance. Using the example by Reddi et al. (2018), we show that asynchronous optimizers (e.g., AdaShift, ACProp) have a weaker convergence condition than synchronous optimizers (e.g., Adam, RMSProp, AdaBelief); among asynchronous optimizers, we show that centering of the second momentum further weakens the convergence condition.
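To make the asynchronous centered update concrete, below is a minimal single-parameter sketch in NumPy based on the description above. The function name, state layout, and hyperparameter defaults are illustrative assumptions, not the paper's reference implementation; only the ordering (denominator from step t-1, numerator from step t) and the centered second momentum follow the abstract.

```python
import numpy as np

def acprop_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ACProp-style step (illustrative sketch, not the reference code).

    Asynchronous update: the denominator uses the centered second-momentum
    estimate s_{t-1} (information up to step t-1), while the numerator uses
    the current gradient g_t.
    """
    m, s = state["m"], state["s"]
    # Parameter update uses the *previous* second-momentum estimate (asynchronous).
    theta = theta - lr * grad / (np.sqrt(s) + eps)
    # First momentum: EMA of gradients.
    m = beta1 * m + (1.0 - beta1) * grad
    # Centered second momentum: EMA of the squared deviation of g_t from m_t.
    s = beta2 * s + (1.0 - beta2) * (grad - m) ** 2
    state["m"], state["s"] = m, s
    return theta, state
```

With state initialized to zeros (state = {"m": np.zeros_like(theta), "s": np.zeros_like(theta)}), the first step is dominated by eps in the denominator; practical implementations typically add bias correction or a warm-up, which this sketch omits for brevity.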