Balancing Client Participation in Federated Learning Using AoI

Alireza Javani and Zhiying Wang

arXiv.org Artificial Intelligence 

Abstract—Federated Learning (FL) offers a decentralized framework that preserves data privacy while enabling collaborative model training across distributed clients. However, FL faces significant challenges due to limited communication resources, statistical heterogeneity, and the need for balanced client participation. This paper proposes an Age of Information (AoI)-based client selection policy that addresses these challenges by minimizing load imbalance through controlled selection intervals. Our method employs a decentralized Markov scheduling policy, allowing clients to independently manage participation based on age-dependent selection probabilities, which balances client updates across training rounds with minimal central oversight. We provide a convergence proof for our method, demonstrating that it ensures stable and efficient model convergence. Specifically, we derive optimal parameters for the Markov selection model to achieve balanced and consistent client participation, highlighting the benefits of AoI in enhancing convergence stability. Through extensive simulations, we demonstrate that our AoI-based method, particularly the optimal Markov variant, improves convergence over the FedAvg selection approach across both IID and non-IID data settings by 7.5% and up to 20%. Our findings underscore the effectiveness of AoI-based scheduling for scalable, fair, and efficient FL systems across diverse learning environments.

Federated learning (FL), introduced by McMahan et al. [1], emerged as a solution to the limitations of traditional machine learning models that require centralized data collection and processing. FL enables client devices to collaboratively train a global model while keeping all training data local, thus addressing privacy concerns.
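To make the age-dependent selection idea concrete, the following is a minimal illustrative simulation, not the paper's method: each client tracks its AoI (rounds since it last participated) and independently joins a round with a probability that grows with its age, resetting the age upon participation. The linear probability schedule and the horizon parameter `H` here are placeholder assumptions, not the optimal Markov parameters derived in the paper.

```python
import random

def simulate_age_based_selection(num_clients=20, rounds=1000, seed=0):
    """Toy simulation of decentralized age-dependent client selection.

    Each client joins a round with probability min(1, age / H), where
    H is a hypothetical age horizon at which selection becomes certain.
    This schedule is illustrative only; the paper derives the optimal
    Markov transition probabilities.
    """
    rng = random.Random(seed)
    H = 5                         # hypothetical age horizon (assumption)
    ages = [1] * num_clients      # AoI: rounds since last participation
    counts = [0] * num_clients    # participation count per client
    for _ in range(rounds):
        for i in range(num_clients):
            if rng.random() < min(1.0, ages[i] / H):
                counts[i] += 1
                ages[i] = 1       # AoI resets when the client participates
            else:
                ages[i] += 1      # AoI grows while the client stays idle
    return counts

counts = simulate_age_based_selection()
print(min(counts), max(counts))
```

Because the selection probability rises deterministically with age, no client can stay idle for more than `H` rounds, which is what keeps the per-client participation counts tightly clustered, i.e., the load balance the paper targets.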
Traditional machine learning approaches require centralized training in data centers, which often becomes impractical for edge devices due to privacy constraints and limited communication resources in wireless networks. Federated learning overcomes these challenges by enabling devices to train machine learning models without sharing or transmitting their data, fulfilling the needs of data privacy and security. Compared to traditional distributed machine learning, federated learning introduces several challenges [2]: system heterogeneity from diverse device capabilities, which causes aggregation delays due to stragglers; statistical heterogeneity arising from non-IID and imbalanced client data, which affects model convergence; and privacy concerns, as exchanged model updates may inadvertently expose sensitive information.

This paper is presented in part at the 2024 IEEE Global Communications Conference (Globecom). The authors are with the Center for Pervasive Communications and Computing, University of California, Irvine (e-mail: ajavani@uci.edu,