A Additional Related Work
In this section we provide further discussion of the related work. The convergence of FedAvg, also known as Local SGD, has been the subject of intense study in recent years, owing to the algorithm's empirical effectiveness combined with the difficulty of analyzing it. In homogeneous data settings, local updates are easier to reconcile with solving the global objective, which has allowed substantial progress on convergence rates in this case [2-4, 62-66]. In the heterogeneous case, multiple works have shown that FedAvg with a fixed learning rate may not solve the global objective, because the local updates induce a non-vanishing bias by drifting towards local solutions, even with full-gradient steps and strongly convex objectives [5-9, 16, 20, 67, 68]. As a remedy, several papers have analyzed FedAvg with a learning rate that decays over communication rounds, and have shown that this approach indeed reaches a stationary point of the global objective, but at sublinear rates [5, 14-17] that can be strictly slower than the convergence rates of D-SGD [5, 18].
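To make the fixed-learning-rate bias concrete, the following is a minimal scalar example of our own construction (an illustration in the spirit of, but not drawn from, the cited works). Let $n$ clients hold quadratics $f_i(x) = \frac{c_i}{2}(x - a_i)^2$ with curvatures $c_i > 0$, so the global objective $f(x) = \frac{1}{n}\sum_{i=1}^n f_i(x)$ is minimized at $x^\star = \sum_i c_i a_i / \sum_i c_i$. One round of FedAvg with $K$ full-gradient local steps of size $\eta \in (0, 1/\max_i c_i)$ applies on client $i$ the affine map $x \mapsto (1 - \eta c_i)\,x + \eta c_i a_i$; after $K$ local steps and server averaging, the round map is
$$x^{+} = \Big(\frac{1}{n}\sum_{i=1}^n \rho_i^K\Big)\, x + \frac{1}{n}\sum_{i=1}^n \big(1 - \rho_i^K\big)\, a_i, \qquad \rho_i = 1 - \eta c_i,$$
whose unique fixed point is
$$\tilde{x} = \frac{\sum_i \big(1 - \rho_i^K\big)\, a_i}{\sum_i \big(1 - \rho_i^K\big)}.$$
For $K = 1$ the weights $1 - \rho_i = \eta c_i$ are proportional to $c_i$, so $\tilde{x} = x^\star$ and the method reduces to unbiased full-batch D-SGD. For $K \ge 2$ and unequal curvatures, the weights $1 - (1-\eta c_i)^K$ are no longer proportional to $c_i$, so $\tilde{x} \neq x^\star$ in general; as $K \to \infty$, $\tilde{x} \to \frac{1}{n}\sum_i a_i$, the plain average of the local minimizers. The iterates thus converge linearly, but to a biased point: exactly the non-vanishing drift described above, despite full gradients and strong convexity.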