Sparse Random Networks for Communication-Efficient Federated Learning

Berivan Isik, Francesco Pase, Deniz Gunduz, Tsachy Weissman, Michele Zorzi

arXiv.org Artificial Intelligence 

One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round. While prior work has made great progress in compressing the weight updates through gradient compression methods, we propose a radically different approach that does not update the weights at all. Instead, our method freezes the weights at their initial random values and learns how to sparsify the random network for the best performance. To this end, the clients collaborate in training a stochastic binary mask to find the optimal sparse random network within the original one. At the end of training, the final model is a sparse network with random weights, or equivalently a subnetwork inside the dense random network. We show improvements in accuracy, communication (less than 1 bit per parameter (bpp)), convergence speed, and final model size (less than 1 bpp) over relevant baselines on the MNIST, EMNIST, CIFAR-10, and CIFAR-100 datasets in the low-bitrate regime.

Federated learning (FL) is a distributed learning framework in which clients collaboratively train a model by performing local training on their data and sharing their local updates with a server every few iterations; the server aggregates the local updates into a global model, which is then transmitted back to the clients for the next round of training. While FL is an appealing approach for training models without collecting client data at the server, the uplink communication of local updates is a significant bottleneck (Kairouz et al., 2021). In this work, while aiming for communication efficiency in FL, we take a radically different approach from prior work and propose a strategy that does not require the communication of weight updates.
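The sketch below illustrates the core idea of learning a stochastic binary mask over frozen random weights. It is a minimal, hypothetical PyTorch example, not the paper's implementation: the `MaskedLinear` class, the use of a straight-through estimator on sigmoid "score" logits, and the toy training step are our own assumptions about one common way such masks are trained. It only shows that gradients flow into the mask scores while the random weights stay fixed, and that a client's upload can be a sampled binary mask of roughly one bit per parameter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Linear layer whose weights are frozen at their random initialization;
    only a stochastic binary mask over the weights is learned.
    (Illustrative sketch; names and details are assumptions, not the paper's code.)"""
    def __init__(self, in_features, out_features):
        super().__init__()
        w = torch.empty(out_features, in_features)
        nn.init.kaiming_normal_(w)
        self.weight = nn.Parameter(w, requires_grad=False)        # frozen random weights
        self.scores = nn.Parameter(torch.zeros(out_features, in_features))  # mask logits

    def forward(self, x):
        p = torch.sigmoid(self.scores)   # per-weight keep probability
        m = torch.bernoulli(p)           # sample a binary mask
        m = m + p - p.detach()           # straight-through estimator: gradients reach the scores
        return F.linear(x, self.weight * m)

# Toy local step on one client: only the mask scores receive gradients.
layer = MaskedLinear(20, 5)
opt = torch.optim.Adam([layer.scores], lr=0.1)
x, y = torch.randn(32, 20), torch.randint(0, 5, (32,))
loss = F.cross_entropy(layer(x), y)
loss.backward()
opt.step()

# What a client could upload: a sampled binary mask, i.e. about 1 bit per parameter
# before any entropy coding (the server-side aggregation is omitted here).
uploaded_mask = torch.bernoulli(torch.sigmoid(layer.scores.detach()))
```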
