A Additional Related Work
In this section we provide further discussion of the related work. The convergence of FedAvg, also known as Local SGD, has been the subject of intense study in recent years, owing to the algorithm's empirical effectiveness combined with the difficulty of analyzing it. In homogeneous data settings, local updates are easier to reconcile with solving the global objective, which has allowed substantial progress on convergence rates in this case [2-4, 62-66]. In the heterogeneous case, multiple works have shown that FedAvg with a fixed learning rate may not solve the global objective, because the local updates induce a non-vanishing bias by drifting towards local solutions, even with full-gradient steps and strongly convex objectives [5-9, 16, 20, 67, 68]. As a remedy, several papers have analyzed FedAvg with a learning rate that decays over communication rounds, and have shown that this approach indeed reaches a stationary point of the global objective, but at sublinear rates [5, 14-17] that can be strictly slower than the convergence rates of D-SGD [5, 18].
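To make the fixed-learning-rate bias concrete, the following is a minimal scalar example of our own construction (an illustration in the spirit of, but not drawn from, the cited works). Let $n$ clients hold quadratics $f_i(x) = \frac{c_i}{2}(x - a_i)^2$ with curvatures $c_i > 0$, so the global objective $f(x) = \frac{1}{n}\sum_{i=1}^n f_i(x)$ is minimized at $x^\star = \sum_i c_i a_i / \sum_i c_i$. One round of FedAvg with $K$ full-gradient local steps of size $\eta \in (0, 1/\max_i c_i)$ applies on client $i$ the affine map $x \mapsto (1 - \eta c_i)\,x + \eta c_i a_i$; after $K$ local steps and server averaging, the round map is
$$x^{+} = \Big(\frac{1}{n}\sum_{i=1}^n \rho_i^K\Big)\, x + \frac{1}{n}\sum_{i=1}^n \big(1 - \rho_i^K\big)\, a_i, \qquad \rho_i = 1 - \eta c_i,$$
whose unique fixed point is
$$\tilde{x} = \frac{\sum_i \big(1 - \rho_i^K\big)\, a_i}{\sum_i \big(1 - \rho_i^K\big)}.$$
For $K = 1$ the weights $1 - \rho_i = \eta c_i$ are proportional to $c_i$, so $\tilde{x} = x^\star$ and the method reduces to unbiased full-batch D-SGD. For $K \ge 2$ and unequal curvatures, the weights $1 - (1-\eta c_i)^K$ are no longer proportional to $c_i$, so $\tilde{x} \neq x^\star$ in general; as $K \to \infty$, $\tilde{x} \to \frac{1}{n}\sum_i a_i$, the plain average of the local minimizers. The iterates thus converge linearly, but to a biased point: exactly the non-vanishing drift described above, despite full gradients and strong convexity.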