heterogeneous model
Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning
Li, Yichen, Wang, Xiuying, Xu, Wenchao, Wang, Haozhao, Qi, Yining, Dong, Jiahua, Li, Ruixuan
Model-Heterogeneous Federated Learning (Hetero-FL) has attracted growing attention for its ability to aggregate knowledge from heterogeneous models while keeping private data locally. To better aggregate knowledge from clients, ensemble distillation, as a widely used and effective technique, is often employed after global aggregation to enhance the performance of the global model. However, simply combining Hetero-FL and ensemble distillation does not always yield promising results and can make the training process unstable. The reason is that existing methods primarily focus on logit distillation, which, while being model-agnostic with softmax predictions, fails to compensate for the knowledge bias arising from heterogeneous models. To tackle this challenge, we propose a stable and efficient Feature Distillation for model-heterogeneous Federated learning, dubbed FedFD, that can incorporate aligned feature information via orthogonal projection to integrate knowledge from heterogeneous models better. Specifically, a new feature-based ensemble federated knowledge distillation paradigm is proposed. The global model on the server needs to maintain a projection layer for each client-side model architecture to align the features separately. Orthogonal techniques are employed to re-parameterize the projection layer to mitigate knowledge bias from heterogeneous models and thus maximize the distilled knowledge. Extensive experiments show that FedFD achieves superior performance compared to state-of-the-art methods.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (4 more...)
Hypernetworks for Model-Heterogeneous Personalized Federated Learning
Zhang, Chen, Li, Husheng, Liu, Xiang, Jiang, Linshan, Wang, Danxin
Recent advances in personalized federated learning have focused on addressing client model heterogeneity. However, most existing methods still require external data, rely on model decoupling, or adopt partial learning strategies, which can limit their practicality and scalability. In this paper, we revisit hypernetwork-based methods and leverage their strong generalization capabilities to design a simple yet effective framework for heterogeneous personalized federated learning. Specifically, we propose MH-pFedHN, which leverages a server-side hypernetwork that takes client-specific embedding vectors as input and outputs personalized parameters tailored to each client's heterogeneous model. To promote knowledge sharing and reduce computation, we introduce a multi-head structure within the hypernetwork, allowing clients with similar model sizes to share heads. Furthermore, we further propose MH-pFedHNGD, which integrates an optional lightweight global model to improve generalization. Our framework does not rely on external datasets and does not require disclosure of client model architectures, thereby offering enhanced privacy and flexibility. Extensive experiments on multiple benchmarks and model settings demonstrate that our approach achieves competitive accuracy, strong generalization, and serves as a robust baseline for future research in model-heterogeneous personalized federated learning.
- North America > United States > Virginia (0.04)
- North America > United States > Alaska > Anchorage Municipality > Anchorage (0.04)
- Asia > Singapore (0.04)
- (2 more...)
HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark
Zhang, Jianqing, Wu, Xinghao, Zhou, Yanbing, Sun, Xiaoting, Cai, Qiqi, Liu, Yang, Hua, Yang, Zheng, Zhenzhe, Cao, Jian, Yang, Qiang
As AI evolves, collaboration among heterogeneous models helps overcome data scarcity by enabling knowledge transfer across institutions and devices. Traditional Federated Learning (FL) only supports homogeneous models, limiting collaboration among clients with heterogeneous model architectures. To address this, Heterogeneous Federated Learning (HtFL) methods are developed to enable collaboration across diverse heterogeneous models while tackling the data heterogeneity issue at the same time. However, a comprehensive benchmark for standardized evaluation and analysis of the rapidly growing HtFL methods is lacking. Firstly, the highly varied datasets, model heterogeneity scenarios, and different method implementations become hurdles to making easy and fair comparisons among HtFL methods. Secondly, the effectiveness and robustness of HtFL methods are under-explored in various scenarios, such as the medical domain and sensor signal modality. To fill this gap, we introduce the first Heterogeneous Federated Learning Library (HtFLlib), an easy-to-use and extensible framework that integrates multiple datasets and model heterogeneity scenarios, offering a robust benchmark for research and practical applications. Specifically, HtFLlib integrates (1) 12 datasets spanning various domains, modalities, and data heterogeneity scenarios; (2) 40 model architectures, ranging from small to large, across three modalities; (3) a modularized and easy-to-extend HtFL codebase with implementations of 10 representative HtFL methods; and (4) systematic evaluations in terms of accuracy, convergence, computation costs, and communication costs. We emphasize the advantages and potential of state-of-the-art HtFL methods and hope that HtFLlib will catalyze advancing HtFL research and enable its broader applications. The code is released at https://github.com/TsingZ0/HtFLlib.
- Health & Medicine > Diagnostic Medicine (0.93)
- Health & Medicine > Therapeutic Area (0.93)
- Information Technology > Security & Privacy (0.67)
FedADP: Unified Model Aggregation for Federated Learning with Heterogeneous Model Architectures
Wang, Jiacheng, Lv, Hongtao, Liu, Lei
Traditional Federated Learning (FL) faces significant challenges in terms of efficiency and accuracy, particularly in heterogeneous environments where clients employ diverse model architectures and have varying computational resources. Such heterogeneity complicates the aggregation process, leading to performance bottlenecks and reduced model generalizability. To address these issues, we propose FedADP, a federated learning framework designed to adapt to client heterogeneity by dynamically adjusting model architectures during aggregation. FedADP enables effective collaboration among clients with differing capabilities, maximizing resource utilization and ensuring model quality. Our experimental results demonstrate that FedADP significantly outperforms existing methods, such as FlexiFed, achieving an accuracy improvement of up to 23.30%, thereby enhancing model adaptability and training efficiency in heterogeneous real-world settings.
- North America > United States > Virginia (0.05)
- Asia > China (0.04)
Model Assembly Learning with Heterogeneous Layer Weight Merging
Zhang, Yi-Kai, Wang, Jin, Zhong, Xu-Xiang, Zhan, De-Chuan, Ye, Han-Jia
Model merging acquires general capabilities without extra data or training by combining multiple models' parameters. Previous approaches achieve linear mode connectivity by aligning parameters into the same loss basin using permutation invariance. In this paper, we introduce Model Assembly Learning (MAL), a novel paradigm for model merging that iteratively integrates parameters from diverse models in an open-ended model zoo to enhance the base model's capabilities. Unlike previous works that require identical architectures, MAL allows the merging of heterogeneous architectures and selective parameters across layers. Specifically, the base model can incorporate parameters from different layers of multiple pre-trained models. We systematically investigate the conditions and fundamental settings of heterogeneous parameter merging, addressing all possible mismatches in layer widths between the base and target models. Furthermore, we establish key laws and provide practical guidelines for effectively implementing MAL.
- Asia > China > Jiangsu Province > Nanjing (0.05)
- North America > United States > Virginia (0.04)
- Asia > Middle East > Jordan (0.04)
- Africa > Rwanda > Kigali > Kigali (0.04)
Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection
de Melo, Leonardo Henrique, Bertoli, Gustavo de Carvalho, Nogueira, Michele, Santos, Aldri Luiz dos, Junior, Lourenço Alves Pereira
Distributed denial-of-service (DDoS) attacks remain a critical threat to Internet services, causing costly disruptions. While machine learning (ML) has shown promise in DDoS detection, current solutions struggle with multi-domain environments where attacks must be detected across heterogeneous networks and organizational boundaries. This limitation severely impacts the practical deployment of ML-based defenses in real-world settings. This paper introduces Anomaly-Flow, a novel framework that addresses this critical gap by combining Federated Learning (FL) with Generative Adversarial Networks (GANs) for privacy-preserving, multi-domain DDoS detection. Our proposal enables collaborative learning across diverse network domains while preserving data privacy through synthetic flow generation. Through extensive evaluation across three distinct network datasets, Anomaly-Flow achieves an average F1-score of $0.747$, outperforming baseline models. Importantly, our framework enables organizations to share attack detection capabilities without exposing sensitive network data, making it particularly valuable for critical infrastructure and privacy-sensitive sectors. Beyond immediate technical contributions, this work provides insights into the challenges and opportunities in multi-domain DDoS detection, establishing a foundation for future research in collaborative network defense systems. Our findings have important implications for academic research and industry practitioners working to deploy practical ML-based security solutions.
- South America > Brazil > São Paulo (0.04)
- South America > Brazil > Minas Gerais (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Moss: Proxy Model-based Full-Weight Aggregation in Federated Learning with Heterogeneous Models
Cai, Yifeng, Zhang, Ziqi, Li, Ding, Guo, Yao, Chen, Xiangqun
Modern Federated Learning (FL) has become increasingly essential for handling highly heterogeneous mobile devices. Current approaches adopt a partial model aggregation paradigm that leads to sub-optimal model accuracy and higher training overhead. In this paper, we challenge the prevailing notion of partial-model aggregation and propose a novel "full-weight aggregation" method named Moss, which aggregates all weights within heterogeneous models to preserve comprehensive knowledge. Evaluation across various applications demonstrates that Moss significantly accelerates training, reduces on-device training time and energy consumption, enhances accuracy, and minimizes network bandwidth utilization when compared to state-of-the-art baselines.
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Virginia (0.04)
- (3 more...)
Training-free Heterogeneous Model Merging
Xu, Zhengqi, Zheng, Han, Song, Jie, Sun, Li, Song, Mingli
Model merging has attracted significant attention as a powerful paradigm for model reuse, facilitating the integration of task-specific models into a singular, versatile framework endowed with multifarious capabilities. Previous studies, predominantly utilizing methods such as Weight Average (WA), have shown that model merging can effectively leverage pretrained models without the need for laborious retraining. However, the inherent heterogeneity among models poses a substantial constraint on its applicability, particularly when confronted with discrepancies in model architectures. To overcome this challenge, we propose an innovative model merging framework designed for heterogeneous models, encompassing both depth and width heterogeneity. To address depth heterogeneity, we introduce a layer alignment strategy that harmonizes model layers by segmenting deeper models, treating consecutive layers with similar representations as a cohesive segment, thus enabling the seamless merging of models with differing layer depths. For width heterogeneity, we propose a novel elastic neuron zipping algorithm that projects the weights from models of varying widths onto a common dimensional space, eliminating the need for identical widths. Extensive experiments validate the efficacy of these proposed methods, demonstrating that the merging of structurally heterogeneous models can achieve performance levels comparable to those of homogeneous merging, across both vision and NLP tasks. Our code is publicly available at https://github.com/zju-vipa/training_free_heterogeneous_model_merging.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Middle East > Jordan (0.04)
Towards Efficient Model-Heterogeneity Federated Learning for Large Models
Jia, Ruofan, Xie, Weiying, Lei, Jie, Qin, Haonan, Ma, Jitao, Fang, Leyuan
As demand grows for complex tasks and high-performance applications in edge computing, the deployment of large models in federated learning has become increasingly urgent, given their superior representational power and generalization capabilities. However, the resource constraints and heterogeneity among clients present significant challenges to this deployment. To tackle these challenges, we introduce HeteroTune, an innovative fine-tuning framework tailored for model-heterogeneity federated learning (MHFL). In particular, we propose a novel parameter-efficient fine-tuning (PEFT) structure, called FedAdapter, which employs a multi-branch cross-model aggregator to enable efficient knowledge aggregation across diverse models. Benefiting from the lightweight FedAdapter, our approach significantly reduces both the computational and communication overhead. Finally, our approach is simple yet effective, making it applicable to a wide range of large model fine-tuning tasks. Extensive experiments on computer vision (CV) and natural language processing (NLP) tasks demonstrate that our method achieves state-of-the-art results, seamlessly integrating efficiency and performance.
- North America > United States > California (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > China (0.04)
FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning
Fan, Boyu, Wu, Chenrui, Su, Xiang, Hui, Pan
Despite extensive research into data heterogeneity in federated learning (FL), system heterogeneity remains a significant yet often overlooked challenge. Traditional FL approaches typically assume homogeneous hardware resources across FL clients, implying that clients can train a global model within a comparable time frame. However, in practical FL systems, clients often have heterogeneous resources, which impacts their training capacity. This discrepancy underscores the importance of exploring model-heterogeneous FL, a paradigm allowing clients to train different models based on their resource capabilities. To address this challenge, we introduce FedTSA, a cluster-based two-stage aggregation method tailored for system heterogeneity in FL. FedTSA begins by clustering clients based on their capabilities, then performs a two-stage aggregation: conventional weight averaging for homogeneous models in Stage 1, and deep mutual learning with a diffusion model for aggregating heterogeneous models in Stage 2. Extensive experiments demonstrate that FedTSA not only outperforms the baselines but also explores various factors influencing model performance, validating FedTSA as a promising approach for model-heterogeneous FL.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (5 more...)