Zaccone, Riccardo
Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum
Zaccone, Riccardo, Masone, Carlo, Ciccone, Marco
Federated Learning (FL) is the state-of-the-art approach for learning from decentralized data in privacy-constrained scenarios. As the current literature reports, the main problems associated with FL concern system and statistical challenges: the former demand efficient learning from edge devices, including lowering communication bandwidth and frequency, while the latter require algorithms robust to non-iidness. State-of-the-art approaches either guarantee convergence at increased communication cost or are not sufficiently robust to handle extremely heterogeneous local distributions. In this work we propose a novel generalization of heavy-ball momentum, and present FedHBM to effectively address statistical heterogeneity in FL without introducing any communication overhead. We conduct extensive experimentation on common FL vision and NLP datasets, showing that our FedHBM algorithm empirically yields better model quality and higher convergence speed w.r.t. the state of the art, especially in pathological non-iid scenarios. While being designed for cross-silo settings, we show how FedHBM is applicable in moderate-to-high cross-device scenarios, and how good model initializations (e.g. pre-training) can be exploited for prompt acceleration. Extended experimentation on large-scale real-world federated datasets further corroborates the effectiveness of our approach for real-world FL applications.
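Since the abstract only sketches the idea, the snippet below is a minimal illustrative sketch of how server-side heavy-ball momentum can be applied to aggregated client updates in FL. The function name `server_update_heavy_ball`, the hyper-parameters, and the update rule are assumptions for illustration; this is classical heavy-ball momentum, not FedHBM's actual generalized formulation.

```python
import numpy as np

def server_update_heavy_ball(global_w, client_deltas, momentum_buf, lr=1.0, beta=0.9):
    """Illustrative server-side heavy-ball step (assumption, not FedHBM itself):
    the averaged client delta acts as a pseudo-gradient, and a momentum buffer
    accumulates it across communication rounds."""
    avg_delta = np.mean(client_deltas, axis=0)        # FedAvg-style pseudo-gradient
    momentum_buf = beta * momentum_buf + avg_delta    # heavy-ball accumulation
    new_w = global_w + lr * momentum_buf              # apply the momentum step
    return new_w, momentum_buf

# Toy usage: 3 clients, 5-dimensional model, placeholder client updates.
w, buf = np.zeros(5), np.zeros(5)
for round_idx in range(10):
    deltas = [np.random.randn(5) * 0.01 for _ in range(3)]
    w, buf = server_update_heavy_ball(w, deltas, buf)
```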
Speeding up Heterogeneous Federated Learning with Sequentially Trained Superclients
Zaccone, Riccardo, Rizzardi, Andrea, Caldarola, Debora, Ciccone, Marco, Caputo, Barbara
Federated Learning (FL) allows training machine learning models in privacy-constrained scenarios by enabling the cooperation of edge devices without requiring local data sharing. This approach raises several challenges due to the different statistical distribution of the local datasets and the clients' computational heterogeneity. In particular, it has been shown that the non-iidness of local datasets leads to unstable and slow convergence [23]. As a solution, we propose FedSeq (Federated Learning via Sequential Superclients Training), a novel framework leveraging the sequential training of subgroups of heterogeneous clients, i.e. superclients, to emulate the centralized paradigm in a privacy-compliant way. Clients with different data distributions are grouped into superclients based on a dissimilarity metric, simulating the presence of larger and more homogeneous datasets.
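As a rough illustration of the superclient idea, the following Python sketch groups clients by a simple dissimilarity heuristic over their label distributions and then trains the model sequentially within each group. The helper names (`greedy_superclients`, `train_superclient_sequentially`, `local_train_fn`) and the grouping heuristic are hypothetical assumptions, not the metric or procedure defined in the FedSeq paper.

```python
import numpy as np

def greedy_superclients(label_dists, group_size):
    """Greedily group clients into 'superclients' so that each group covers
    diverse label distributions (a simple stand-in for a dissimilarity-based
    grouping; the actual FedSeq metric is an assumption here)."""
    remaining = list(range(len(label_dists)))
    groups = []
    while remaining:
        group = [remaining.pop(0)]
        while remaining and len(group) < group_size:
            centroid = np.mean([label_dists[c] for c in group], axis=0)
            # Pick the client whose label distribution differs most from the group.
            far = max(remaining, key=lambda c: np.linalg.norm(label_dists[c] - centroid))
            remaining.remove(far)
            group.append(far)
        groups.append(group)
    return groups

def train_superclient_sequentially(model_state, group, local_train_fn):
    """Within a superclient, the model is passed from client to client and
    trained sequentially, emulating training on their pooled data."""
    for client_id in group:
        model_state = local_train_fn(model_state, client_id)
    return model_state

# Toy usage: 6 clients with random label histograms over 4 classes.
rng = np.random.default_rng(0)
dists = [rng.dirichlet(np.ones(4)) for _ in range(6)]
print(greedy_superclients(dists, group_size=3))
```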