Differentially Private Over-the-Air Federated Learning Over MIMO Fading Channels

Liu, Hang, Yan, Jia, Zhang, Ying-Jun Angela

arXiv.org Artificial Intelligence 

Abstract—Federated learning (FL) enables edge devices to collaboratively train machine learning models, with model communication replacing direct data uploading. While over-the-air model aggregation improves communication efficiency, uploading models to an edge server over wireless networks can pose privacy risks. Differential privacy (DP) is a widely used quantitative technique to measure statistical data privacy in FL. Previous research has focused on over-the-air FL with a single-antenna server, leveraging communication noise to enhance user-level DP. This approach achieves the so-called "free DP" by controlling transmit power rather than introducing additional DP-preserving mechanisms at devices, such as adding artificial noise. In this paper, we study differentially private over-the-air FL over a multiple-input multiple-output (MIMO) fading channel. We show that FL model communication with a multiple-antenna server amplifies privacy leakage when the server employs separate receive combining for model aggregation and information inference. Consequently, relying solely on communication noise, as done in the multiple-input single-output system, cannot meet high privacy requirements, and a device-side privacy-preserving mechanism is necessary for optimal DP design. We analyze the learning convergence and privacy loss of the studied FL system and propose a transceiver design algorithm based on alternating optimization. Numerical results demonstrate that the proposed method achieves a better privacy-learning trade-off compared to prior work.

The emergence of artificial intelligence (AI) applications that leverage massive data generated at the edge of wireless networks has attracted widespread interest [2], [3]. Federated learning (FL) is a popular paradigm for exploiting edge devices' data and computation power for distributed machine learning.
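The "free DP" idea described above can be illustrated with a minimal simulation. The sketch below is our own illustration, not the paper's algorithm: devices clip their local updates to bound sensitivity, transmit them simultaneously so the server only observes their analog superposition plus receiver noise, and the channel noise then plays the role of a Gaussian DP mechanism. All parameter values (`K`, `d`, `noise_std`, `clip_norm`) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not from the paper): K devices, d model parameters.
K, d = 10, 5
noise_std = 0.5   # receiver (channel) noise standard deviation
clip_norm = 1.0   # per-device L2 clipping bound (the DP sensitivity)

# Each device clips its local model update to bound the sensitivity of
# the aggregate to any single device's data.
updates = rng.normal(size=(K, d))
norms = np.linalg.norm(updates, axis=1, keepdims=True)
clipped = updates * np.minimum(1.0, clip_norm / norms)

# Over-the-air aggregation: all devices transmit at once, so the server
# observes only the superposition plus receiver noise, never an
# individual update.
received = clipped.sum(axis=0) + rng.normal(scale=noise_std, size=d)
aggregate_estimate = received / K

# "Free DP": the channel noise acts as a Gaussian mechanism. A classical
# sufficient condition for (epsilon, delta)-DP with sensitivity clip_norm is
#   sigma >= clip_norm * sqrt(2 * ln(1.25 / delta)) / epsilon,
# which we invert here to get the per-round epsilon implied by noise_std.
delta = 1e-5
epsilon = clip_norm * np.sqrt(2 * np.log(1.25 / delta)) / noise_std
print(f"per-round epsilon ~= {epsilon:.2f} at delta = {delta}")
```

With a single-antenna receiver this noise is shared by aggregation and any would-be inference; the paper's point is that a multi-antenna server can apply a separate combiner for inference, effectively reducing the noise protecting each device, so channel noise alone no longer suffices.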
FL coordinates the distributed training of an AI model on edge devices by periodically sharing model information with an edge server [4].

This work was supported in part by the General Research Fund (project numbers 14201920, 14202421, 14214122, 14202723), the Area of Excellence Scheme grant (project number AoE/E-601/22-R), and the NSFC/RGC Collaborative Research Scheme (project number CRS_HKUST603/22), all from the Research Grants Council of Hong Kong. The work of J. Yan was supported in part by the Guangzhou Municipal Science and Technology Project 2023A03J0011. Part of this work was presented at the IEEE Global Communications Conference (GLOBECOM), Kuala Lumpur, Malaysia, December 2023 [1]. He is now with the Department of Electrical and Computer Engineering at Cornell Tech, Cornell University, NY 10044, USA.