Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game
Niehaus, Sebastian, Roeder, Ingo, Scherf, Nico
arXiv.org Artificial Intelligence
Decentralized data is data distributed across multiple, often geographically dispersed locations or sources, rather than held at a single site, server, or storage location. Such decentralization is becoming more common due to the proliferation of connected devices, edge computing, and privacy concerns. While decentralized data offers advantages in terms of data security, privacy, and accessibility, it poses significant challenges for training machine learning algorithms. Decentralized machine learning [1] [3] addresses these challenges by enabling model training across multiple nodes without the need to centralize the data. Techniques such as federated learning [15] allow the data to remain on local devices, while only model updates are shared and aggregated, preserving privacy and reducing the risk of data breaches [36]. This approach not only increases data security, but also enables compliance with data protection regulations and improves scalability by using the computing power of numerous decentralized nodes. A particular challenge in these decentralized learning setups is posed by domains in which the data distributions differ strongly between individual nodes [11]. This problem is referred to as non-independent and identically distributed (non-iid) data [17]. It concerns differences in the label distributions that can arise from user behavior, geographical differences, different levels of knowledge, socio-cultural differences, or technical differences in the recording devices [20]. In medical use cases, the problem arises because the nodes, which are also the data generators, differ widely from one another.
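The aggregation step and the label skew described above can be illustrated with a minimal Python sketch. It is not the method proposed in the paper: it assumes a plain size-weighted average of per-node parameter vectors (in the style of federated averaging), and the node names, label lists, and parameter values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def label_distribution(labels):
    """Return the relative frequency of each label held by one node."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


def federated_average(updates, sizes):
    """Average per-node parameter vectors, weighted by local dataset size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [
        sum(u[i] * n for u, n in zip(updates, sizes)) / total
        for i in range(dim)
    ]


# Hypothetical nodes whose label distributions differ strongly (non-iid).
node_labels = {
    "node_a": [0] * 90 + [1] * 10,
    "node_b": [0] * 20 + [1] * 80,
}

# Hypothetical parameter vectors after local training; raw data never leaves a node.
node_params = {
    "node_a": [1.0, 0.0],
    "node_b": [0.0, 1.0],
}

distributions = {name: label_distribution(lbls) for name, lbls in node_labels.items()}
sizes = [len(node_labels[name]) for name in node_params]
global_params = federated_average(list(node_params.values()), sizes)

print(distributions)
print(global_params)
```

Only the parameter vectors and the dataset sizes reach the aggregator in this sketch. The two nodes hold nearly opposite label proportions, which is the kind of imbalance that makes a single averaged model hard to fit fairly to every node.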
Dec-20-2024
- Country:
- Asia > India (0.04)
- Europe > Germany
- North America > United States (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Technology: