Federated Learning with Differential Privacy: An Utility-Enhanced Approach
Kanishka Ranaweera, Dinh C. Nguyen, Pubudu N. Pathirana, David Smith, Ming Ding, Thierry Rakotoarivelo, Aruna Seneviratne
arXiv.org Artificial Intelligence
Abstract: Federated learning has emerged as an attractive approach to protecting data privacy: it eliminates the need to share clients' data while reducing communication costs compared with centralized machine learning algorithms. However, recent studies have shown that federated learning alone does not guarantee privacy, as private data may still be inferred from the parameters uploaded to the central server. To avoid such data leakage, differential privacy (DP) can be adopted either in the local optimization process or in the local update aggregation process, yielding sample-level or user-level privacy guarantees, respectively, in federated learning models. Compared to their non-private equivalents, however, these approaches suffer from poor utility. To improve the privacy-utility trade-off, we present a modification to these vanilla differentially private algorithms based on a Haar wavelet transformation step and a novel noise injection scheme that significantly lowers the asymptotic bound of the noise variance. We also present a holistic convergence analysis of our proposed algorithm, showing that our method yields better convergence performance than the vanilla DP algorithms. Numerical experiments on real-world datasets demonstrate that our method outperforms existing approaches in model utility while maintaining the same privacy guarantees.

Machine learning (ML) has become an essential tool for analyzing data and extracting valuable insights for various applications, including facial recognition, data analytics, weather prediction, and speech recognition, among others [1], [2], [3], [4], [5]. However, in real-world settings, data, particularly personal data, is often created and stored on end-user devices. Most traditional ML algorithms require these training data to be centralized, i.e., collected and processed at a powerful cloud-based server [6], [7]. This process carries significant risks to data integrity and privacy, particularly when it comes to personal data.

Kanishka Ranaweera is with the School of Engineering and Built Environment, Deakin University, Waurn Ponds, VIC 3216, Australia, and also with Data61, CSIRO, Eveleigh, NSW 2015, Australia. Dinh C. Nguyen is with the Department of Electrical and Computer Engineering, The University of Alabama in Huntsville, Huntsville, AL, USA. Pubudu N. Pathirana is with the School of Engineering and Built Environment, Deakin University, Waurn Ponds, VIC 3216, Australia.
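The mechanism the abstract describes, injecting DP noise after a Haar wavelet transformation of a clipped model update, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (haar_forward, haar_inverse, privatize_update) and the uniform noise scale sigma are assumptions for illustration only, and the paper's actual contribution, a coefficient-wise noise injection scheme that lowers the asymptotic bound of the noise variance, is simplified here to i.i.d. Gaussian noise in the wavelet domain.

```python
# Hypothetical sketch: clip a client's (flattened) model update, move it into
# the Haar wavelet domain, add Gaussian noise there, and transform back.
# Names and the uniform noise calibration are illustrative assumptions, not
# the paper's scheme.
import numpy as np

def haar_forward(x: np.ndarray) -> np.ndarray:
    """One level of the orthonormal Haar transform (even-length 1-D input)."""
    pairs = x.reshape(-1, 2)
    avg = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)   # approximation coefficients
    diff = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)  # detail coefficients
    return np.concatenate([avg, diff])

def haar_inverse(c: np.ndarray) -> np.ndarray:
    """Invert one level of the orthonormal Haar transform."""
    half = c.size // 2
    avg, diff = c[:half], c[half:]
    x = np.empty(c.size)
    x[0::2] = (avg + diff) / np.sqrt(2.0)
    x[1::2] = (avg - diff) / np.sqrt(2.0)
    return x

def privatize_update(update, clip_norm=1.0, sigma=0.8, rng=None):
    """Clip the update, add Gaussian noise in the Haar domain, transform back."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))  # bound L2 sensitivity
    coeffs = haar_forward(clipped)
    noisy = coeffs + rng.normal(0.0, sigma * clip_norm, coeffs.shape)
    return haar_inverse(noisy)

# Toy usage: privatize one client's update before sending it to the server.
update = np.random.default_rng(0).standard_normal(8)
print(privatize_update(update))
```

Because the orthonormal Haar transform preserves the L2 norm, clipping to a ball of radius clip_norm bounds the sensitivity identically in the wavelet domain, so standard Gaussian-mechanism privacy accounting carries over to this sketch.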
Mar-27-2025
- Country:
- North America > United States
- Alabama > Madison County > Huntsville (0.24)
- Oceania > Australia (0.74)
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Information Technology > Security & Privacy (1.00)