AITopics | heterogeneous model

Collaborating Authors

heterogeneous model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning

Li, Yichen, Wang, Xiuying, Xu, Wenchao, Wang, Haozhao, Qi, Yining, Dong, Jiahua, Li, Ruixuan

arXiv.org Artificial IntelligenceOct-15-2025

Model-Heterogeneous Federated Learning (Hetero-FL) has attracted growing attention for its ability to aggregate knowledge from heterogeneous models while keeping private data locally. To better aggregate knowledge from clients, ensemble distillation, as a widely used and effective technique, is often employed after global aggregation to enhance the performance of the global model. However, simply combining Hetero-FL and ensemble distillation does not always yield promising results and can make the training process unstable. The reason is that existing methods primarily focus on logit distillation, which, while being model-agnostic with softmax predictions, fails to compensate for the knowledge bias arising from heterogeneous models. To tackle this challenge, we propose a stable and efficient Feature Distillation for model-heterogeneous Federated learning, dubbed FedFD, that can incorporate aligned feature information via orthogonal projection to integrate knowledge from heterogeneous models better. Specifically, a new feature-based ensemble federated knowledge distillation paradigm is proposed. The global model on the server needs to maintain a projection layer for each client-side model architecture to align the features separately. Orthogonal techniques are employed to re-parameterize the projection layer to mitigate knowledge bias from heterogeneous models and thus maximize the distilled knowledge. Extensive experiments show that FedFD achieves superior performance compared to state-of-the-art methods.

artificial intelligence, distillation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.10348

Country: Asia > China (0.68)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Hypernetworks for Model-Heterogeneous Personalized Federated Learning

Zhang, Chen, Li, Husheng, Liu, Xiang, Jiang, Linshan, Wang, Danxin

arXiv.org Artificial IntelligenceJul-31-2025

Recent advances in personalized federated learning have focused on addressing client model heterogeneity. However, most existing methods still require external data, rely on model decoupling, or adopt partial learning strategies, which can limit their practicality and scalability. In this paper, we revisit hypernetwork-based methods and leverage their strong generalization capabilities to design a simple yet effective framework for heterogeneous personalized federated learning. Specifically, we propose MH-pFedHN, which leverages a server-side hypernetwork that takes client-specific embedding vectors as input and outputs personalized parameters tailored to each client's heterogeneous model. To promote knowledge sharing and reduce computation, we introduce a multi-head structure within the hypernetwork, allowing clients with similar model sizes to share heads. Furthermore, we further propose MH-pFedHNGD, which integrates an optional lightweight global model to improve generalization. Our framework does not rely on external datasets and does not require disclosure of client model architectures, thereby offering enhanced privacy and flexibility. Extensive experiments on multiple benchmarks and model settings demonstrate that our approach achieves competitive accuracy, strong generalization, and serves as a robust baseline for future research in model-heterogeneous personalized federated learning.

artificial intelligence, federated learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.2233

Country:

Asia (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.92)

Industry: Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark

Zhang, Jianqing, Wu, Xinghao, Zhou, Yanbing, Sun, Xiaoting, Cai, Qiqi, Liu, Yang, Hua, Yang, Zheng, Zhenzhe, Cao, Jian, Yang, Qiang

arXiv.org Artificial IntelligenceJun-5-2025

As AI evolves, collaboration among heterogeneous models helps overcome data scarcity by enabling knowledge transfer across institutions and devices. Traditional Federated Learning (FL) only supports homogeneous models, limiting collaboration among clients with heterogeneous model architectures. To address this, Heterogeneous Federated Learning (HtFL) methods are developed to enable collaboration across diverse heterogeneous models while tackling the data heterogeneity issue at the same time. However, a comprehensive benchmark for standardized evaluation and analysis of the rapidly growing HtFL methods is lacking. Firstly, the highly varied datasets, model heterogeneity scenarios, and different method implementations become hurdles to making easy and fair comparisons among HtFL methods. Secondly, the effectiveness and robustness of HtFL methods are under-explored in various scenarios, such as the medical domain and sensor signal modality. To fill this gap, we introduce the first Heterogeneous Federated Learning Library (HtFLlib), an easy-to-use and extensible framework that integrates multiple datasets and model heterogeneity scenarios, offering a robust benchmark for research and practical applications. Specifically, HtFLlib integrates (1) 12 datasets spanning various domains, modalities, and data heterogeneity scenarios; (2) 40 model architectures, ranging from small to large, across three modalities; (3) a modularized and easy-to-extend HtFL codebase with implementations of 10 representative HtFL methods; and (4) systematic evaluations in terms of accuracy, convergence, computation costs, and communication costs. We emphasize the advantages and potential of state-of-the-art HtFL methods and hope that HtFLlib will catalyze advancing HtFL research and enable its broader applications. The code is released at https://github.com/TsingZ0/HtFLlib.

artificial intelligence, deep learning, machine learning, (12 more...)

arXiv.org Artificial Intelligence

2506.03954

Country:

Asia > China (0.70)
North America (0.70)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area (0.93)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

FedADP: Unified Model Aggregation for Federated Learning with Heterogeneous Model Architectures

Wang, Jiacheng, Lv, Hongtao, Liu, Lei

arXiv.org Artificial IntelligenceMay-13-2025

Traditional Federated Learning (FL) faces significant challenges in terms of efficiency and accuracy, particularly in heterogeneous environments where clients employ diverse model architectures and have varying computational resources. Such heterogeneity complicates the aggregation process, leading to performance bottlenecks and reduced model generalizability. To address these issues, we propose FedADP, a federated learning framework designed to adapt to client heterogeneity by dynamically adjusting model architectures during aggregation. FedADP enables effective collaboration among clients with differing capabilities, maximizing resource utilization and ensuring model quality. Our experimental results demonstrate that FedADP significantly outperforms existing methods, such as FlexiFed, achieving an accuracy improvement of up to 23.30%, thereby enhancing model adaptability and training efficiency in heterogeneous real-world settings.

artificial intelligence, machine learning, object-oriented architecture, (14 more...)

arXiv.org Artificial Intelligence

2505.06497

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.82)

Add feedback

Model Assembly Learning with Heterogeneous Layer Weight Merging

Zhang, Yi-Kai, Wang, Jin, Zhong, Xu-Xiang, Zhan, De-Chuan, Ye, Han-Jia

arXiv.org Artificial IntelligenceMar-27-2025

Model merging acquires general capabilities without extra data or training by combining multiple models' parameters. Previous approaches achieve linear mode connectivity by aligning parameters into the same loss basin using permutation invariance. In this paper, we introduce Model Assembly Learning (MAL), a novel paradigm for model merging that iteratively integrates parameters from diverse models in an open-ended model zoo to enhance the base model's capabilities. Unlike previous works that require identical architectures, MAL allows the merging of heterogeneous architectures and selective parameters across layers. Specifically, the base model can incorporate parameters from different layers of multiple pre-trained models. We systematically investigate the conditions and fundamental settings of heterogeneous parameter merging, addressing all possible mismatches in layer widths between the base and target models. Furthermore, we establish key laws and provide practical guidelines for effectively implementing MAL.

architecture, artificial intelligence, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2503.21657

Country:

Asia > China > Jiangsu Province > Nanjing (0.05)
North America > United States > Virginia (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)

Add feedback

Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection

de Melo, Leonardo Henrique, Bertoli, Gustavo de Carvalho, Nogueira, Michele, Santos, Aldri Luiz dos, Junior, Lourenço Alves Pereira

arXiv.org Artificial IntelligenceMar-18-2025

Distributed denial-of-service (DDoS) attacks remain a critical threat to Internet services, causing costly disruptions. While machine learning (ML) has shown promise in DDoS detection, current solutions struggle with multi-domain environments where attacks must be detected across heterogeneous networks and organizational boundaries. This limitation severely impacts the practical deployment of ML-based defenses in real-world settings. This paper introduces Anomaly-Flow, a novel framework that addresses this critical gap by combining Federated Learning (FL) with Generative Adversarial Networks (GANs) for privacy-preserving, multi-domain DDoS detection. Our proposal enables collaborative learning across diverse network domains while preserving data privacy through synthetic flow generation. Through extensive evaluation across three distinct network datasets, Anomaly-Flow achieves an average F1-score of $0.747$, outperforming baseline models. Importantly, our framework enables organizations to share attack detection capabilities without exposing sensitive network data, making it particularly valuable for critical infrastructure and privacy-sensitive sectors. Beyond immediate technical contributions, this work provides insights into the challenges and opportunities in multi-domain DDoS detection, establishing a foundation for future research in collaborative network defense systems. Our findings have important implications for academic research and industry practitioners working to deploy practical ML-based security solutions.

artificial intelligence, detection, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.14618

Country:

South America > Brazil > São Paulo (0.04)
South America > Brazil > Minas Gerais (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Moss: Proxy Model-based Full-Weight Aggregation in Federated Learning with Heterogeneous Models

Cai, Yifeng, Zhang, Ziqi, Li, Ding, Guo, Yao, Chen, Xiangqun

arXiv.org Artificial IntelligenceMar-13-2025

Modern Federated Learning (FL) has become increasingly essential for handling highly heterogeneous mobile devices. Current approaches adopt a partial model aggregation paradigm that leads to sub-optimal model accuracy and higher training overhead. In this paper, we challenge the prevailing notion of partial-model aggregation and propose a novel "full-weight aggregation" method named Moss, which aggregates all weights within heterogeneous models to preserve comprehensive knowledge. Evaluation across various applications demonstrates that Moss significantly accelerates training, reduces on-device training time and energy consumption, enhances accuracy, and minimizes network bandwidth utilization when compared to state-of-the-art baselines.

architecture, heterogeneous model, moss, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3712274

2503.10218

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Virginia (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.46)

Add feedback

Training-free Heterogeneous Model Merging

Xu, Zhengqi, Zheng, Han, Song, Jie, Sun, Li, Song, Mingli

arXiv.org Artificial IntelligenceDec-28-2024

Model merging has attracted significant attention as a powerful paradigm for model reuse, facilitating the integration of task-specific models into a singular, versatile framework endowed with multifarious capabilities. Previous studies, predominantly utilizing methods such as Weight Average (WA), have shown that model merging can effectively leverage pretrained models without the need for laborious retraining. However, the inherent heterogeneity among models poses a substantial constraint on its applicability, particularly when confronted with discrepancies in model architectures. To overcome this challenge, we propose an innovative model merging framework designed for heterogeneous models, encompassing both depth and width heterogeneity. To address depth heterogeneity, we introduce a layer alignment strategy that harmonizes model layers by segmenting deeper models, treating consecutive layers with similar representations as a cohesive segment, thus enabling the seamless merging of models with differing layer depths. For width heterogeneity, we propose a novel elastic neuron zipping algorithm that projects the weights from models of varying widths onto a common dimensional space, eliminating the need for identical widths. Extensive experiments validate the efficacy of these proposed methods, demonstrating that the merging of structurally heterogeneous models can achieve performance levels comparable to those of homogeneous merging, across both vision and NLP tasks. Our code is publicly available at https://github.com/zju-vipa/training_free_heterogeneous_model_merging.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.00061

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Towards Efficient Model-Heterogeneity Federated Learning for Large Models

Jia, Ruofan, Xie, Weiying, Lei, Jie, Qin, Haonan, Ma, Jitao, Fang, Leyuan

arXiv.org Artificial IntelligenceNov-25-2024

As demand grows for complex tasks and high-performance applications in edge computing, the deployment of large models in federated learning has become increasingly urgent, given their superior representational power and generalization capabilities. However, the resource constraints and heterogeneity among clients present significant challenges to this deployment. To tackle these challenges, we introduce HeteroTune, an innovative fine-tuning framework tailored for model-heterogeneity federated learning (MHFL). In particular, we propose a novel parameter-efficient fine-tuning (PEFT) structure, called FedAdapter, which employs a multi-branch cross-model aggregator to enable efficient knowledge aggregation across diverse models. Benefiting from the lightweight FedAdapter, our approach significantly reduces both the computational and communication overhead. Finally, our approach is simple yet effective, making it applicable to a wide range of large model fine-tuning tasks. Extensive experiments on computer vision (CV) and natural language processing (NLP) tasks demonstrate that our method achieves state-of-the-art results, seamlessly integrating efficiency and performance.

aggregation, dataset, federated learning, (14 more...)

arXiv.org Artificial Intelligence

2411.16796

Country:

North America > United States > California (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning

Fan, Boyu, Wu, Chenrui, Su, Xiang, Hui, Pan

arXiv.org Artificial IntelligenceJul-15-2024

Despite extensive research into data heterogeneity in federated learning (FL), system heterogeneity remains a significant yet often overlooked challenge. Traditional FL approaches typically assume homogeneous hardware resources across FL clients, implying that clients can train a global model within a comparable time frame. However, in practical FL systems, clients often have heterogeneous resources, which impacts their training capacity. This discrepancy underscores the importance of exploring model-heterogeneous FL, a paradigm allowing clients to train different models based on their resource capabilities. To address this challenge, we introduce FedTSA, a cluster-based two-stage aggregation method tailored for system heterogeneity in FL. FedTSA begins by clustering clients based on their capabilities, then performs a two-stage aggregation: conventional weight averaging for homogeneous models in Stage 1, and deep mutual learning with a diffusion model for aggregating heterogeneous models in Stage 2. Extensive experiments demonstrate that FedTSA not only outperforms the baselines but also explores various factors influencing model performance, validating FedTSA as a promising approach for model-heterogeneous FL.

diffusion model, federated learning, heterogeneity, (15 more...)

arXiv.org Artificial Intelligence

2407.05098

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.48)

Industry: Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback