FedMAX: Mitigating Activation Divergence for Accurate and Communication-Efficient Federated Learning
Wei Chen, Kartikeya Bhardwaj, Radu Marculescu
In this paper, we identify a new phenomenon called activation-divergence, which occurs in Federated Learning (FL) due to data heterogeneity (i.e., non-IID data) across multiple users. Specifically, we argue that the activation vectors in FL can diverge, even if subsets of users share a few common classes with data residing on different devices. To address the activation-divergence issue, we introduce a prior based on the principle of maximum entropy; this prior assumes minimal information about the per-device activation vectors and aims to make the activation vectors of the same class as similar as possible across multiple devices. Our results show that, for both IID and non-IID settings, our proposed approach achieves better accuracy (due to the significantly more similar activation vectors across devices) and is more communication-efficient than state-of-the-art approaches in FL. Finally, we illustrate the effectiveness of our approach on a few common benchmarks and two large medical datasets.
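The abstract describes the maximum-entropy prior only at a high level. Below is a minimal PyTorch sketch of one way such a prior can be instantiated: penalizing the KL divergence between a uniform distribution and the softmax of each activation vector drives the activations toward maximum entropy, which discourages per-device divergence. The function names, the `beta` weight, and the choice of the final fully connected layer's input as the activation vector are illustrative assumptions, not code from the paper.

```python
import torch
import torch.nn.functional as F

def max_entropy_regularizer(activations: torch.Tensor) -> torch.Tensor:
    """Zero-information prior on activation vectors.

    `activations` has shape (batch, d) and is assumed to be the input
    to the model's final fully connected layer. Minimizing
    KL(U || softmax(a_i)) pushes each activation distribution toward
    the uniform distribution U, i.e., toward maximum entropy.
    """
    log_probs = F.log_softmax(activations, dim=1)            # log softmax(a_i)
    uniform = torch.full_like(log_probs, 1.0 / activations.size(1))
    # F.kl_div expects log-probabilities as input and probabilities as
    # target; with target=U this is KL(U || softmax(a_i)), batch-averaged.
    return F.kl_div(log_probs, uniform, reduction="batchmean")

def local_loss(logits, targets, activations, beta: float = 1.0):
    """Per-device objective: cross-entropy plus the entropy-based prior."""
    return F.cross_entropy(logits, targets) + beta * max_entropy_regularizer(activations)
```

In a federated setting, each device would add this term to its local training objective before the usual model aggregation step; `beta` trades off task accuracy against activation similarity across devices.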
Apr-7-2020
- Country:
- North America
- Canada > Ontario
- Toronto (0.04)
- United States
- California > Santa Clara County
- San Jose (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.14)
- Texas > Travis County
- Austin (0.14)
- Virginia (0.04)
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Health & Medicine > Diagnostic Medicine (0.46)