Delay-Tolerant Algorithms for Asynchronous Distributed Online Learning
Brendan McMahan, Matthew Streeter
Neural Information Processing Systems
We analyze new online gradient descent algorithms for distributed systems with large delays between gradient computations and the corresponding updates. Using insights from adaptive gradient methods, we develop algorithms that adapt not only to the sequence of gradients but also to the precise update delays that occur. We first give an impractical algorithm whose regret bound precisely quantifies the impact of the delays. We then analyze AdaptiveRevision, an algorithm that is efficiently implementable and achieves comparable guarantees. The key algorithmic technique is to revise, appropriately and efficiently, the learning rate used for previous gradient steps. Experimental results show that when the delays grow large (1000 updates or more), our new algorithms perform significantly better than standard adaptive gradient methods.
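To make the delayed-update setting concrete, the following is a minimal sketch of the kind of baseline the abstract compares against: a one-dimensional AdaGrad-style gradient descent in which each gradient is applied a fixed number of rounds after it was computed. It is not AdaptiveRevision and does not revise past learning rates. The function name `delayed_adagrad`, the fixed `delay`, the scale `alpha`, and the quadratic loss in the usage example are all assumptions made for illustration.

```python
from collections import deque
import math


def delayed_adagrad(grad_fn, x0, num_rounds, delay, alpha=1.0):
    """Hypothetical 1-D AdaGrad-style descent with a fixed update delay.

    grad_fn(x, t) returns the gradient at parameter x in round t.
    Each gradient is applied `delay` rounds after it was computed, so it
    may have been evaluated at a stale parameter value.
    """
    x = x0
    sum_sq = 0.0       # running sum of squared gradients that have been applied
    pending = deque()  # gradients computed but not yet applied
    history = [x]
    for t in range(num_rounds):
        # A worker reads the current parameter and computes a gradient.
        pending.append(grad_fn(x, t))
        # The oldest gradient is applied only once `delay` newer ones are queued.
        if len(pending) > delay:
            g = pending.popleft()
            sum_sq += g * g
            if sum_sq > 0.0:
                # Standard adaptive step: alpha / sqrt(sum of squared gradients).
                x -= alpha / math.sqrt(sum_sq) * g
        history.append(x)
    return history


# Usage: minimize 0.5 * (x - 3)^2, whose gradient is x - 3 (assumed toy loss).
no_delay = delayed_adagrad(lambda x, t: x - 3.0, 0.0, 50, delay=0)
with_delay = delayed_adagrad(lambda x, t: x - 3.0, 0.0, 50, delay=5)
```

With `delay=5` the parameter does not move for the first five rounds, and every later step uses a gradient evaluated at an older parameter value. The step size here depends only on the gradients themselves, not on how stale they were, which is the gap the delay-adaptive algorithms described above aim to close.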
Feb-9-2025, 04:44:20 GMT