Non-IID data in Federated Learning: A Survey with Taxonomy, Metrics, Methods, Frameworks and Future Directions
G., Daniel M. Jimenez, Solans, David, Heikkila, Mikko, Vitaletti, Andrea, Kourtellis, Nicolas, Anagnostopoulos, Aris, Chatzigiannakis, Ioannis
arXiv.org Artificial Intelligence
Recent advances in machine learning have highlighted Federated Learning (FL) as a promising approach that enables multiple distributed users (so-called clients) to collectively train ML models without sharing their private data. While this privacy-preserving method shows potential, it struggles when the data across clients is not independent and identically distributed (non-IID). Non-IID data remains an unsolved challenge that can result in poorer model performance and longer training times. Despite the significance of non-IID data in FL, researchers lack consensus on how to classify and quantify it. This technical survey aims to fill that gap by providing a detailed taxonomy for non-IID data, partition protocols, and metrics to quantify data heterogeneity. Additionally, we describe popular solutions for addressing non-IID data and the standardized frameworks employed in FL with heterogeneous data. Based on our survey of the state of the art, we present key lessons learned and suggest promising future research directions.
Dec-12-2024
- Country:
- Asia
- China (0.04)
- Japan (0.04)
- Middle East > Qatar
- Arabian Gulf (0.04)
- Singapore (0.04)
- South Korea (0.04)
- Europe
- North America
- Canada > Quebec
- Montreal (0.04)
- United States (0.04)
- Oceania > Australia (0.04)
- Genre:
- Overview (1.00)
- Research Report > Promising Solution (0.87)
- Industry:
- Education (1.00)
- Health & Medicine > Diagnostic Medicine (0.92)
- Information Technology > Security & Privacy (1.00)
- Technology: