Scaffold with Stochastic Gradients: New Analysis with Linear Speed-Up
Paul Mangold, Alain Durmus, Aymeric Dieuleveut, Eric Moulines
This paper proposes a novel analysis of the Scaffold algorithm, a popular method for dealing with data heterogeneity in federated learning. While its convergence in deterministic settings, where local control variates mitigate client drift, is well established, the impact of stochastic gradient updates on its performance is less well understood. To address this gap, we first show that its global parameters and control variates define a Markov chain that converges to a stationary distribution in the Wasserstein distance. Leveraging this result, we prove that Scaffold achieves linear speed-up in the number of clients, up to higher-order terms in the step size. Nevertheless, our analysis reveals that Scaffold retains a higher-order bias, similar to FedAvg, that does not decrease as the number of clients increases. This highlights opportunities for developing improved stochastic federated learning algorithms.
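For readers unfamiliar with the algorithm, the sketch below shows one synchronous Scaffold round with full client participation and the standard "option II" control-variate update (Karimireddy et al., 2020). It is a minimal illustration, not the paper's implementation; the names `scaffold_round`, `grad_fns`, `local_steps`, `lr_local`, and `lr_global` are illustrative assumptions.

```python
# Minimal NumPy sketch of one Scaffold round (full participation, "option II" updates).
# All names and defaults here are illustrative, not taken from the paper.
import numpy as np

def scaffold_round(x, c, client_cs, grad_fns, local_steps=10,
                   lr_local=0.1, lr_global=1.0, rng=None):
    """One synchronous round of Scaffold.

    x          : global model parameters (np.ndarray)
    c          : global control variate, same shape as x
    client_cs  : list of per-client control variates c_i
    grad_fns   : list of callables g_i(params, rng) returning a stochastic gradient
    """
    rng = rng if rng is not None else np.random.default_rng()
    delta_x, delta_c = np.zeros_like(x), np.zeros_like(x)
    for i, grad_fn in enumerate(grad_fns):
        y = x.copy()
        for _ in range(local_steps):
            g = grad_fn(y, rng)                      # stochastic local gradient
            y -= lr_local * (g - client_cs[i] + c)   # drift-corrected local step
        # "option II" control-variate update
        c_new = client_cs[i] - c + (x - y) / (local_steps * lr_local)
        delta_x += (y - x) / len(grad_fns)
        delta_c += (c_new - client_cs[i]) / len(grad_fns)
        client_cs[i] = c_new
    # server aggregation of model and control-variate increments
    x = x + lr_global * delta_x
    c = c + delta_c
    return x, c, client_cs
```

In the deterministic regime, the correction term (- c_i + c) cancels the drift induced by heterogeneous local objectives. The paper's analysis concerns the regime where `grad_fn` returns noisy stochastic gradients, which is where the Markov-chain argument, the linear speed-up result, and the residual higher-order bias described in the abstract apply.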
Mar-10-2025
- Country:
- Europe (0.14)
- North America > United States (0.14)
- Genre:
- Research Report > New Finding (0.45)
- Industry:
- Government (0.46)