AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.59)

Jaramillo-Civill, Mariona, Wu, Peng, Closas, Pau

DPMM-CFL: Clustered Federated Learning via Dirichlet Process Mixture Model Nonparametric Clustering

arXiv.org Machine Learning, Oct-9-2025

Clustered Federated Learning (CFL) improves performance under non-IID client heterogeneity by clustering clients and training one model per cluster, thereby balancing between a global model and fully personalized models. However, most CFL methods require the number of clusters K to be fixed a priori, which is impractical when the latent structure is unknown. We propose DPMM-CFL, a CFL algorithm that places a Dirichlet Process (DP) prior over the distribution of cluster parameters. This enables nonparametric Bayesian inference to jointly infer both the number of clusters and client assignments, while optimizing per-cluster federated objectives. This results in a method where, at each round, federated updates and cluster inferences are coupled, as presented in this paper. The algorithm is validated on benchmark datasets under Dirichlet and class-split non-IID partitions.
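One way to see how a DP prior leaves the number of clusters open is the Chinese restaurant process (CRP) view of the prior: a new client joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. The sketch below shows only this prior term. The function name and the `alpha` parameter are illustrative assumptions, and the likelihood term and the federated updates that DPMM-CFL combines with the prior are not modelled here.

```python
def crp_assignment_probs(cluster_sizes, alpha):
    """CRP prior over where the next client goes.

    cluster_sizes: number of clients currently in each existing cluster.
    alpha: DP concentration parameter (hypothetical value chosen by the user).
    Returns one probability per existing cluster, plus a final entry for
    opening a brand-new cluster.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = sum(cluster_sizes)
    denom = n + alpha
    probs = [size / denom for size in cluster_sizes]
    probs.append(alpha / denom)  # probability of creating a new cluster
    return probs


# Two existing clusters holding 3 and 1 clients, concentration alpha = 1.
probs = crp_assignment_probs([3, 1], alpha=1.0)
```

Larger values of `alpha` put more prior mass on opening new clusters, which is why K does not need to be fixed in advance under this kind of prior.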

assignment, clustered federated learning, federated learning, (12 more...)

arXiv.org Machine Learning

2510.07132

Country:

North America > United States > Virginia (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence, Aug-22-2025

MMiC: Mitigating Modality Incompleteness in Clustered Federated Learning

Yang, Lishan, Zhang, Wei Emma, Sheng, Quan Z., Yao, Lina, Chen, Weitong, Shakeri, Ali

In the era of big data, data mining has become indispensable for uncovering hidden patterns and insights from vast and complex datasets. The integration of multimodal data sources further enhances its potential. Multimodal Federated Learning (MFL) is a distributed approach that enhances the efficiency and quality of multimodal learning, ensuring collaborative work and privacy protection. However, missing modalities pose a significant challenge in MFL, often due to data quality issues or privacy policies across the clients. In this work, we present MMiC, a framework for Mitigating Modality incompleteness in MFL within the Clusters. MMiC replaces partial parameters within client models inside clusters to mitigate the impact of missing modalities. Furthermore, it leverages the Banzhaf Power Index to optimize client selection under these conditions. Finally, MMiC employs an innovative approach to dynamically control global aggregation by utilizing Markowitz Portfolio Optimization. Extensive experiments demonstrate that MMiC consistently outperforms existing federated learning architectures in both global and personalized performance on multimodal datasets with missing modalities, confirming the effectiveness of our proposed solution. Our code is available at https://github.com/gotobcn8/MMiC.
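For readers unfamiliar with the Banzhaf Power Index, the sketch below computes its textbook form for a weighted voting game: a player's index is the share of all "swings", that is, winning coalitions that would lose without that player. The abstract does not say how MMiC maps clients onto such a game, so the weights, the quota, and the function name here are purely illustrative assumptions and not the authors' selection rule.

```python
from itertools import combinations


def banzhaf_index(weights, quota):
    """Normalized Banzhaf power index for a weighted voting game.

    weights: voting weight of each player (hypothetical stand-in for clients).
    quota: total weight a coalition needs in order to win.
    Enumerates all coalitions, so it is only practical for small player counts.
    """
    n = len(weights)
    swings = [0] * n
    for size in range(n + 1):
        for coalition in combinations(range(n), size):
            total = sum(weights[i] for i in coalition)
            if total < quota:
                continue  # losing coalition: nobody can be a swing player
            for i in coalition:
                # Player i is a swing if removing it makes the coalition lose.
                if total - weights[i] < quota:
                    swings[i] += 1
    total_swings = sum(swings)
    if total_swings == 0:
        return [0.0] * n
    return [s / total_swings for s in swings]


# Three players with weights 2, 1, 1 and a quota of 3.
index = banzhaf_index([2, 1, 1], quota=3)
```

In this small game the first player is pivotal in three winning coalitions and the other two in one each, so power is not proportional to weight.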

artificial intelligence, data mining, machine learning, (17 more...)

doi: 10.1145/3746252.3761140

2505.06911

Country:

North America > United States (1.00)
Oceania (0.94)
Europe (0.93)

Genre:

Overview > Innovation (0.34)
Research Report > Promising Solution (0.34)

Industry: Information Technology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Helcig, Michael A., Nastic, Stefan

FedCCL: Federated Clustered Continual Learning Framework for Privacy-focused Energy Forecasting

arXiv.org Artificial Intelligence, Jul-8-2025

Privacy-preserving distributed model training is crucial for modern machine learning applications, yet existing Federated Learning approaches struggle with heterogeneous data distributions and varying computational capabilities. Traditional solutions either treat all participants uniformly or require costly dynamic clustering during training, leading to reduced efficiency and delayed model specialization. We present FedCCL (Federated Clustered Continual Learning), a framework specifically designed for environments with static organizational characteristics but dynamic client availability. By combining static pre-training clustering with an adapted asynchronous FedAvg algorithm, FedCCL enables new clients to immediately profit from specialized models without prior exposure to their data distribution, while maintaining reduced coordination overhead and resilience to client disconnections. Our approach implements an asynchronous Federated Learning protocol with a three-tier model topology (global, cluster-specific, and local models) that efficiently manages knowledge sharing across heterogeneous participants. Evaluation using photovoltaic installations across central Europe demonstrates that FedCCL's location-based clustering achieves an energy prediction error of 3.93% (±0.21%), while maintaining data privacy and showing that the framework maintains stability for population-independent deployments, with 0.14 percentage point degradation in performance for new installations. The results demonstrate that FedCCL offers an effective framework for privacy-preserving distributed learning, maintaining high accuracy and adaptability even with dynamic participant populations. The Federated Learning (FL) paradigm [1], [2] has emerged as a pivotal solution for privacy-preserving machine learning, enabling multiple participants to collaboratively train models while maintaining data privacy.

artificial intelligence, data mining, machine learning, (20 more...)

doi: 10.1109/ICFEC65699.2025.00012

2504.20282

Country:

North America > United States (0.68)
Europe (0.55)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Renewable > Solar (1.00)
Energy > Power Industry (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.75)

arXiv.org Artificial Intelligence, Jun-17-2025

EBS-CFL: Efficient and Byzantine-robust Secure Clustered Federated Learning

Li, Zhiqiang, Bao, Haiyong, Guan, Menghong, Pan, Hao, Huang, Cheng, Dai, Hong-Ning

Despite federated learning (FL)'s potential in collaborative learning, its performance has deteriorated due to the data heterogeneity of distributed users. Recently, clustered federated learning (CFL) has emerged to address this challenge by partitioning users into clusters according to their similarity. However, CFL faces difficulties in training when users are unwilling to share their cluster identities due to privacy concerns. To address these issues, we present an innovative Efficient and Robust Secure Aggregation scheme for CFL, dubbed EBS-CFL. The proposed EBS-CFL supports effective CFL training while keeping users' cluster identities confidential. Moreover, it detects potential poisoning attacks without compromising individual client gradients by discarding negatively correlated gradients and aggregating positively correlated ones using a weighted approach. The server also authenticates correct gradient encoding by clients. EBS-CFL has high efficiency with client-side overhead O(ml + m^2) for communication and O(m^2l) for computation, where m is the number of cluster identities, and l is the gradient size. When m = 1, EBS-CFL's client-side computational efficiency is at least O(log n) times better than comparison schemes, where n is the number of clients. In addition, we validate the scheme through extensive experiments. Finally, we theoretically prove the scheme's security.
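The filtering idea in the abstract (drop negatively correlated gradients, weight the rest) can be sketched in plaintext as follows. This is only an illustration under stated assumptions: correlation is measured here as cosine similarity to a reference direction supplied by the caller, the weights are those similarities, and all names are hypothetical. EBS-CFL performs its check under secure aggregation without exposing individual gradients, which this sketch does not attempt to model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filtered_weighted_mean(gradients, reference):
    """Discard gradients with non-positive similarity to `reference`,
    then average the rest, weighting each by its similarity score."""
    scored = [(cosine(g, reference), g) for g in gradients]
    kept = [(score, g) for score, g in scored if score > 0.0]
    if not kept:
        return [0.0] * len(reference)
    total = sum(score for score, _ in kept)
    dim = len(reference)
    return [sum(score * g[k] for score, g in kept) / total for k in range(dim)]


# One aligned, one orthogonal, and one opposed gradient (toy values).
aggregate = filtered_weighted_mean(
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    reference=[1.0, 0.0],
)
```

In the toy call only the aligned gradient survives the filter, so the orthogonal and opposed updates have no influence on the aggregate.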

artificial intelligence, gradient, machine learning, (17 more...)

doi: 10.1609/aaai.v39i17.34046

2506.13612

Country:

Asia > China (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Neural Information Processing Systems, Feb-7-2025, 10:16:54 GMT

Review for NeurIPS paper: An Efficient Framework for Clustered Federated Learning

Additional Feedback: Empirical Analysis: - The approach is not compared to related work. Straightforward baselines would be the clustering on the central machine approach [9] or the fine-tuning of global models [7, 35], which are cited in the paper. Theoretical Analysis: My main concern with the theoretical analysis is the assumption that initial models are already very close to their correct clusters (1/4 of the minimum distance between cluster centers for the linear models - for the strongly convex problems an additional factor comes in that depends on the strong convexity and smoothness of the loss). I would argue that if models were initialized this way, then performing a clustering on the initial models should already give the right clusters. A minor issue is that the convergence rate seems not to address the number of participating workers (line 4 of Algo.

clustered federated learning, efficient framework, neurips paper, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-7-2025, 10:16:46 GMT

Review for NeurIPS paper: An Efficient Framework for Clustered Federated Learning

Reviewers agree that the central idea is simple, which can be seen as a strength, and that the analysis is valuable. The concern about comparison only to baselines and not a more real-world method will be rectified by including the promised comparison to ClusteredFL. Without this comparison at submission, we must assume it will be on par, and therefore the significance of the result is reduced. The statements about reduced computation at the central server can also be accompanied by statements about privacy benefits (not sending user data to the server), even given the provisos at line 347.

clustered federated learning, efficient framework, neurips paper, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Licciardi, Alessandro, Leo, Davide, Faní, Eros, Caputo, Barbara, Ciccone, Marco

Interaction-Aware Gaussian Weighting for Clustered Federated Learning

arXiv.org Artificial Intelligence, Feb-5-2025

Federated Learning (FL) emerged as a decentralized paradigm to train models while preserving privacy. However, conventional FL struggles with data heterogeneity and class imbalance, which degrade model performance. Clustered FL balances personalization and decentralized training by grouping clients with analogous data distributions, enabling improved accuracy while adhering to privacy constraints. This approach effectively mitigates the adverse impact of heterogeneity in FL. In this work, we propose a novel clustered FL method, FedGWC (Federated Gaussian Weighting Clustering), which groups clients based on their data distribution, allowing training of a more robust and personalized model on the identified clusters. FedGWC identifies homogeneous clusters by transforming individual empirical losses to model client interactions with a Gaussian reward mechanism. Additionally, we introduce the Wasserstein Adjusted Score, a new clustering metric for FL to evaluate cluster cohesion with respect to the individual class distribution. Our experiments on benchmark datasets show that FedGWC outperforms existing FL algorithms in cluster quality and classification accuracy, validating the efficacy of our approach.

fedgwc, interaction-aware gaussian weighting, machine learning, (15 more...)

2502.0334

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Neural Information Processing Systems, Oct-11-2024, 14:54:11 GMT

An Efficient Framework for Clustered Federated Learning

clustered federated learning, efficient framework, federated learning, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.62)

Konti, Xenia, Riess, Hans, Giannopoulos, Manos, Shen, Yi, Pencina, Michael J., Economou-Zavlanos, Nicoleta J., Zavlanos, Michael M.

Distributionally Robust Clustered Federated Learning: A Case Study in Healthcare

arXiv.org Artificial Intelligence, Oct-9-2024

In this paper, we address the challenge of heterogeneous data distributions in cross-silo federated learning by introducing a novel algorithm, which we term Cross-silo Robust Clustered Federated Learning (CS-RCFL). Our approach leverages the Wasserstein distance to construct ambiguity sets around each client's empirical distribution that capture possible distribution shifts in the local data, enabling evaluation of worst-case model performance. We then propose a model-agnostic integer fractional program to determine the optimal distributionally robust clustering of clients into coalitions so that possible biases in the local models caused by statistically heterogeneous client datasets are avoided, and analyze our method for linear and logistic regression models. Finally, we discuss a federated learning protocol that ensures the privacy of client distributions, a critical consideration, for instance, when clients are healthcare institutions. We evaluate our algorithm on synthetic and real-world healthcare data.
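To make the notion of a Wasserstein ambiguity set concrete, the sketch below handles the simplest case: two one-dimensional empirical distributions with the same number of samples, where the Wasserstein-1 distance reduces to the mean absolute difference between sorted samples. A candidate distribution lies in the ambiguity set if its distance to the client's empirical distribution is at most a chosen radius. The function names and the radius are illustrative assumptions, and the abstract's integer fractional program for clustering is not reproduced here.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal sample counts with uniform weights, the optimal transport plan
    matches the i-th smallest value of xs to the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    a, b = sorted(xs), sorted(ys)
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def in_ambiguity_set(candidate, empirical, radius):
    """True if `candidate` is within `radius` of `empirical` in W1 distance."""
    return wasserstein_1d(candidate, empirical) <= radius


# Toy samples: the second is the first shifted by one unit.
distance = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
inside = in_ambiguity_set([0.0, 1.0, 2.0], [0.5, 1.5, 2.5], radius=0.5)
```

A pure shift by one unit moves every sample the same amount, so the distance equals the size of the shift; a larger radius admits larger distribution shifts into the set.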

federated learning, hospital, learning, (14 more...)

2410.07039

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.36)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)