Parameterized Knowledge Transfer for Personalized Federated Learning
In recent years, personalized federated learning (pFL) has attracted increasing attention for its potential to deal with statistical heterogeneity among clients. However, state-of-the-art pFL methods rely on model parameter aggregation at the server side, which requires all models to have the same structure and size, and thus limits their application in more heterogeneous scenarios. To remove this model constraint, we exploit the potential of heterogeneous model settings and propose a novel training framework that employs personalized models for different clients. Specifically, we reformulate the aggregation procedure of original pFL as a personalized group knowledge transfer training algorithm, namely KT-pFL, which enables each client to maintain a personalized soft prediction at the server side to guide the others' local training. KT-pFL updates the personalized soft prediction of each client as a linear combination of all local soft predictions using a knowledge coefficient matrix, which can adaptively reinforce collaboration among clients with similar data distributions. Furthermore, to quantify each client's contribution to others' personalized training, the knowledge coefficient matrix is parameterized so that it can be trained jointly with the models. The knowledge coefficient matrix and the model parameters are updated alternately in each round via gradient descent. Extensive experiments on various datasets (EMNIST, Fashion_MNIST, CIFAR-10) are conducted under different settings (heterogeneous models and data distributions). The results demonstrate that the proposed framework is the first federated learning paradigm to realize personalized model training via parameterized group knowledge transfer, while achieving significant performance gains compared with state-of-the-art algorithms.
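The parameterized aggregation just described is easy to picture in code. Below is a minimal sketch, assuming every client has already computed soft predictions on a shared reference batch; the names (soft_preds, coeff_logits, personalized_targets) are illustrative, and the stand-in loss only mimics the role of the local distillation objective, so this is not the authors' released implementation.

```python
# Minimal sketch of the knowledge-coefficient aggregation step (assumed names).
import torch
import torch.nn.functional as F

num_clients, num_samples, num_classes = 5, 32, 10

# Local soft predictions each client computed on a shared reference batch.
soft_preds = torch.softmax(torch.randn(num_clients, num_samples, num_classes), dim=-1)

# Trainable knowledge coefficient matrix: row n weights every client's
# contribution to client n's personalized target (rows kept on a simplex).
coeff_logits = torch.zeros(num_clients, num_clients, requires_grad=True)
optimizer = torch.optim.SGD([coeff_logits], lr=0.1)

def personalized_targets(logits, preds):
    w = torch.softmax(logits, dim=1)          # normalize each row of the matrix
    # Linear combination of all local soft predictions, one target per client.
    return torch.einsum('nm,msc->nsc', w, preds)

# One gradient step on the coefficient matrix; the KL term below is a
# stand-in for the per-client distillation loss evaluated on local data.
targets = personalized_targets(coeff_logits, soft_preds)
loss = F.kl_div(targets.log(), soft_preds, reduction='batchmean')
loss.backward()
optimizer.step()
```

In the alternating scheme, a step like this on the coefficient matrix would be interleaved with each client's usual local model update.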
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Virginia (0.04)
- Europe > United Kingdom > Wales > Ceredigion > Aberystwyth (0.04)
- (2 more...)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Optimal Transceiver Design in Over-the-Air Federated Distillation
Hu, Zihao, Yan, Jia, Zhang, Ying-Jun Angela, Zhang, Jun, Letaief, Khaled B.
The rapid proliferation and growth of artificial intelligence (AI) have led to the development of federated learning (FL). FL allows wireless devices (WDs) to learn cooperatively by sharing only local model parameters, without needing to share their entire datasets. However, the emergence of large AI models has made existing FL approaches inefficient due to the significant communication overhead they require. In this paper, we propose a novel over-the-air federated distillation (FD) framework that synergizes the strengths of FL and knowledge distillation to avoid heavy local model transmission. Instead of sharing the model parameters, only the WDs' model outputs, referred to as knowledge, are shared and aggregated over the air by exploiting the superposition property of the multiple-access channel. We study the transceiver design in over-the-air FD, aiming to maximize the learning convergence rate while meeting the power constraints of the transceivers. The main challenges lie in the intractability of the learning performance analysis and in the non-convexity of an optimization that spans the whole FD training period. To tackle this problem, we first derive an analytical expression for the convergence rate of over-the-air FD. Then, given the receiver combining strategy, we obtain closed-form optimal solutions for the WDs' transmit power and the estimator used in over-the-air aggregation. Accordingly, we put forth an efficient approach that finds the optimal receiver beamforming vector via semidefinite relaxation. We further prove that there is no optimality gap between the original and relaxed problems for the receiver beamforming design. Numerical results show that the proposed over-the-air FD approach achieves a significant reduction in communication overhead, with only a minor compromise in testing accuracy compared to conventional FL benchmarks.
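To make the superposition idea concrete, here is a toy simulation of over-the-air aggregation under a deliberately simple policy: every device scales its transmission against its channel so the received sum is an unbiased (noisy) scaled average. The variable names and the channel-inversion rule are illustrative assumptions; the paper instead derives jointly optimal power control, estimator, and receive beamforming.

```python
# Toy over-the-air aggregation: devices pre-scale their "knowledge" vectors,
# transmit simultaneously, and the server estimates the average from the
# superposed noisy sum. The channel-inversion policy here is an assumption,
# not the paper's optimal transceiver design.
import numpy as np

rng = np.random.default_rng(0)
num_devices, dim = 8, 10                       # devices and knowledge dimension
noise_std = 0.05
p_max = 1.0                                    # per-device transmit power budget

knowledge = rng.random((num_devices, dim))     # local model outputs in [0, 1]
h = np.abs(rng.normal(1.0, 0.3, num_devices))  # effective channel gains

# Align all devices to the weakest admissible gain so the superposed signal
# is a common multiple of the true sum; eta keeps every device within p_max.
eta = np.sqrt(p_max) * h.min()
tx_scale = eta / h                             # per-device pre-scaler, <= sqrt(p_max)

# Superposition over the multiple-access channel plus receiver noise.
rx = (h * tx_scale) @ knowledge + rng.normal(0, noise_std, dim)

# Server-side estimator: undo the common scaling and average.
estimate = rx / (eta * num_devices)
true_avg = knowledge.mean(axis=0)
print("estimation error:", np.abs(estimate - true_avg).max())
```

The noise term is amplified by 1/eta, which is exactly why weak channels make naive inversion costly and why optimized power and beamforming matter.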
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > Singapore (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Communications > Networks (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation
Chen, Huancheng, Wang, Johnny, Vikalo, Haris
Heterogeneity of data distributed across clients limits the performance of global models trained through federated learning, especially in settings with highly imbalanced class distributions in the local datasets. In recent years, personalized federated learning (pFL) has emerged as a potential solution to the challenges presented by heterogeneous data. However, existing pFL methods typically enhance the performance of local models at the expense of the global model's accuracy. We propose FedHKD (Federated Hyper-Knowledge Distillation), a novel FL algorithm in which clients rely on knowledge distillation (KD) to train local models. In particular, each client extracts and sends to the server the means of local data representations and the corresponding soft predictions, information that we refer to as "hyper-knowledge". Notably, unlike other KD-based pFL methods, FedHKD neither relies on a public dataset nor deploys a generative model at the server. We analyze the convergence of FedHKD and conduct extensive experiments on visual datasets in a variety of scenarios, demonstrating that FedHKD provides significant improvements in both personalized and global model performance compared to state-of-the-art FL methods designed for heterogeneous data settings. Federated learning (FL), a communication-efficient and privacy-preserving alternative to training on centrally aggregated data, relies on collaboration among clients who own local data to train a global machine learning model.
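As a rough illustration of what a client-side "hyper-knowledge" pass could look like, the sketch below computes per-class means of feature representations together with the matching mean soft predictions. The handles encoder, classifier, and loader, and the function name itself, are hypothetical and not taken from the FedHKD codebase.

```python
# Sketch of client-side hyper-knowledge extraction (assumed names throughout).
import torch

@torch.no_grad()
def extract_hyper_knowledge(encoder, classifier, loader, num_classes):
    """Per-class mean representation and mean soft prediction (sent to server)."""
    rep_sums, soft_sums = None, None
    counts = torch.zeros(num_classes)
    for x, y in loader:
        reps = encoder(x)                           # local feature representations
        soft = torch.softmax(classifier(reps), -1)  # local soft predictions
        if rep_sums is None:                        # lazily size the accumulators
            rep_sums = torch.zeros(num_classes, reps.shape[1])
            soft_sums = torch.zeros(num_classes, num_classes)
        for c in range(num_classes):
            mask = y == c
            if mask.any():
                rep_sums[c] += reps[mask].sum(0)
                soft_sums[c] += soft[mask].sum(0)
                counts[c] += mask.sum()
    counts = counts.clamp(min=1).unsqueeze(1)       # guard classes absent locally
    return rep_sums / counts, soft_sums / counts
```

Because only these per-class means leave the device, the payload scales with the number of classes and the representation width rather than with the model size, which is the communication argument the abstract makes.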
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Virginia (0.04)