Ensemble Distillation for Robust Model Fusion in Federated Learning

Neural Information Processing Systems

Federated Learning (FL) has emerged as an important machine learning paradigm in which a federation of clients participates in collaborative training of a centralized model [62, 51, 65, 8, 5, 42, 34]. The clients send their model parameters to the server but never their private training datasets, thereby ensuring a basic level of privacy.
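The parameter-averaging fusion that these schemes rely on can be sketched as a dataset-size-weighted average of the clients' parameter vectors. This is an illustrative FedAvg-style sketch; the function name and the weighting choice are ours, not from the paper:

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Fuse client models by a dataset-size-weighted average of their
    parameter vectors (FedAvg-style aggregation; illustrative sketch)."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()  # normalize by total number of samples
    stacked = np.stack([np.asarray(p, dtype=float) for p in client_params])
    # weighted average over the client axis
    return np.tensordot(weights, stacked, axes=1)

# two clients with toy 2-parameter "models"; client 2 holds 3x more data
fused = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3])
# → array([2.5, 3.5])
```

Note that this only works when all clients share an identical architecture, which is exactly the restriction that output-level fusion methods such as FedDF lift.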




Author Rebuttal for NeurIPS paper: Ensemble Distillation for Robust Model Fusion in Federated Learning

Neural Information Processing Systems

We thank all reviewers for their time and their valuable feedback. We will add corrections/clarifications as suggested. We would like to emphasize our contribution, as summarized by R5's thorough review: "What's novel about this paper is formulating a robust, efficient training scheme with extensive results and analysis, which is significant enough." Our proposed FedDF is not the mentioned engineering solution. [...] FL round; thus mutually beneficial information can be shared across architectures. GAN training is not involved in any stage of FL and cannot steal clients' data. Data generation is done by the (frozen) generator before the FL training, by performing inference on random noise.
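The data-generation step described here, running a frozen generator on random noise before FL training begins, might look like the following sketch. The linear-map generator is a stand-in of our own, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained, frozen generator: a fixed linear map
# from an 8-dim noise vector to a 16-dim "sample" (hypothetical shapes).
G = rng.standard_normal((8, 16))

def generate_distillation_data(n_samples, noise_dim=8):
    """Run inference on random noise with the frozen generator;
    no gradients flow and no client data is ever touched."""
    z = rng.standard_normal((n_samples, noise_dim))
    return z @ G  # frozen forward pass only

# build the unlabeled distillation set once, before FL starts
unlabeled = generate_distillation_data(100)
```

Because the generator is fixed before training and only consumes noise, this step cannot leak information about any client's private dataset.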


Review for NeurIPS paper: Ensemble Distillation for Robust Model Fusion in Federated Learning

Neural Information Processing Systems

Strengths: This work manifests a solid understanding of the key requirements and challenges of federated learning, and presents a practical solution with significant improvements. The contribution of this paper is the formulation of a robust, efficient training scheme in FL, with extensive results and analysis, which is relevant to the NeurIPS community. The authors provide sufficient justification for why the additional computation is negligible in practice and why FedDF's reduced number of communication rounds and ability to handle architecture heterogeneity matter more. They analyze its contribution from various angles, including efficiency, utilizing heterogeneous computation resources of clients, robustness to the choice of distillation dataset, and handling heterogeneous client data by mitigating the quality loss of batch normalization under different data distributions. The results are sensible and believable.


A Unified Solution to Diverse Heterogeneities in One-shot Federated Learning

Bai, Jun, Song, Yiliao, Wu, Di, Sajjanhar, Atul, Xiang, Yong, Zhou, Wei, Tao, Xiaohui, Li, Yan

arXiv.org Artificial Intelligence

One-shot federated learning (FL) limits communication between the server and clients to a single round, which greatly reduces the privacy-leakage risks of traditional FL, which requires multiple rounds of communication. However, we find that existing one-shot FL frameworks are vulnerable to distributional heterogeneity because they concentrate predominantly on model heterogeneity and pay insufficient attention to data heterogeneity. Filling this gap, we propose a unified, data-free, one-shot federated learning framework (FedHydra) that can effectively address both model and data heterogeneity. Rather than applying existing value-only learning mechanisms, FedHydra uses a structure-value learning mechanism: a new stratified learning structure is proposed to cover data heterogeneity, while the value of each item during computation reflects model heterogeneity. By this design, the data and model heterogeneity issues are monitored simultaneously from different aspects during learning, so FedHydra can effectively mitigate both by minimizing their inherent conflicts. We compare FedHydra with three SOTA baselines on four benchmark datasets. Experimental results show that our method outperforms previous one-shot FL methods in both homogeneous and heterogeneous settings.


Ensemble Distillation for Robust Model Fusion in Federated Learning

Lin, Tao, Kong, Lingjing, Stich, Sebastian U., Jaggi, Martin

arXiv.org Machine Learning

Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model while keeping the training data decentralized. In most of the current training schemes the central model is refined by averaging the parameters of the server model and the updated parameters from the client side. However, directly averaging model parameters is only possible if all models have the same structure and size, which could be a restrictive constraint in many scenarios. In this work we investigate more powerful and more flexible aggregation schemes for FL. Specifically, we propose ensemble distillation for model fusion, i.e. training the central classifier through unlabeled data on the outputs of the models from the clients. This knowledge distillation technique mitigates privacy risk and cost to the same extent as the baseline FL algorithms, but allows flexible aggregation over heterogeneous client models that can differ e.g. in size, numerical precision or structure. We show in extensive empirical experiments on various CV/NLP datasets (CIFAR-10/100, ImageNet, AG News, SST2) and settings (heterogeneous models/data) that the server model can be trained much faster, requiring fewer communication rounds than any existing FL technique so far.
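A minimal sketch of the fusion step the abstract describes, averaging the client models' logits on unlabeled data and distilling them into the server model, assuming a plain NumPy setting; the helper names are ours, not the paper's:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_teacher(client_logits):
    """Average the clients' logits on the unlabeled batch. Because only
    outputs are combined, clients may have different architectures."""
    return np.mean(np.stack(client_logits), axis=0)

def distillation_loss(student_logits, teacher_logits):
    """KL(teacher || student): the objective minimized when training
    the fused server model on the ensemble's outputs."""
    t = softmax(teacher_logits)
    s = softmax(student_logits)
    return float(np.mean(np.sum(t * (np.log(t) - np.log(s)), axis=-1)))

# two heterogeneous clients, 4 unlabeled samples, 3 classes
logits_a = np.random.default_rng(1).standard_normal((4, 3))
logits_b = np.random.default_rng(2).standard_normal((4, 3))
teacher = ensemble_teacher([logits_a, logits_b])
loss = distillation_loss(np.zeros((4, 3)), teacher)  # untrained student
```

In a full training loop, `loss` would be backpropagated through the server model's logits for a few distillation steps per communication round; only this output-level averaging, not parameter averaging, needs to be well-defined across model sizes and structures.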