client drift
ILoRA: Federated Learning with Low-Rank Adaptation for Heterogeneous Client Aggregation
Zhou, Junchao, Liu, Junkang, Shang, Fanhua
Federated Learning with Low-Rank Adaptation (LoRA) faces three critical challenges under client heterogeneity: (1) Initialization-Induced Instability due to random initialization misaligning client subspaces; (2) Rank Incompatibility and Aggregation Error when averaging LoRA parameters of different ranks, which biases the global model; and (3) exacerbated Client Drift under Non-IID Data, impairing generalization. To address these challenges, we propose ILoRA, a unified framework that integrates three core innovations: a QR-based orthonormal initialization to ensure all clients start in a coherent subspace; a Concatenated QR Aggregation mechanism that fuses heterogeneous-rank updates via concatenation and decomposition, preserving information while maintaining dimension alignment; and an AdamW optimizer with rank-aware control variates to correct local updates and mitigate client drift. Supported by theoretical convergence guarantees, ILoRA consistently achieves superior accuracy and convergence stability compared to existing federated LoRA methods in extensive experiments on vision and NLP benchmarks.
- Asia > China > Tianjin Province > Tianjin (0.40)
- North America > United States > Virginia (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
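The two structural pieces of ILoRA described above, QR-based orthonormal initialization and concatenated QR aggregation of heterogeneous-rank LoRA factors, can be sketched as follows. This is a minimal NumPy illustration based only on the abstract; the truncation rule (QR followed by a truncated SVD of the small core) and the weighting scheme are assumptions, not the paper's exact procedure.

```python
import numpy as np

def orthonormal_init(d, r, seed=0):
    """QR-based initialization: orthonormal columns from a shared seed,
    so every client starts its LoRA factor in the same coherent subspace."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return Q

def concat_qr_aggregate(Bs, As, weights, r_global):
    """Fuse heterogeneous-rank LoRA updates (B_i @ A_i with ranks r_i)
    into a single rank-r_global pair via concatenation + decomposition."""
    B_cat = np.concatenate(Bs, axis=1)                              # d x sum(r_i)
    A_cat = np.concatenate([w * A for w, A in zip(weights, As)], axis=0)
    Q, R = np.linalg.qr(B_cat)                                      # exact refactorization
    U, s, Vt = np.linalg.svd(R @ A_cat, full_matrices=False)        # small core only
    B_new = (Q @ U[:, :r_global]) * s[:r_global]                    # d x r_global
    A_new = Vt[:r_global]                                           # r_global x k
    return B_new, A_new
```

Because the QR acts on the concatenated factors rather than on padded or truncated per-client matrices, no client's update is discarded before the final rank reduction, which matches the abstract's claim of preserving information under dimension alignment.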
FedMuon: Accelerating Federated Learning with Matrix Orthogonalization
Liu, Junkang, Shang, Fanhua, Zhou, Junchao, Liu, Hongying, Liu, Yuanyuan, Liu, Jin
The core bottleneck of Federated Learning (FL) lies in the communication rounds; achieving more effective local updates is therefore crucial for reducing them. Existing FL methods still primarily use element-wise local optimizers (Adam/SGD), neglecting the geometric structure of the weight matrices. This often amplifies pathological directions in the weights during local updates, degrading the condition number and slowing convergence. Therefore, we introduce the Muon optimizer for local updates, which uses matrix orthogonalization to optimize matrix-structured parameters. Experimental results show that, in the IID setting, Local Muon significantly accelerates the convergence of FL and reduces communication rounds compared to Local SGD and Local AdamW. However, in the non-IID setting, independent matrix orthogonalization based on each client's local distribution induces strong client drift. Applying Muon in non-IID FL thus poses two significant challenges: (1) client-specific preconditioners leading to client drift; (2) momentum reinitialization. To address these challenges, we propose a novel Federated Muon optimizer (FedMuon), which incorporates two key techniques: (1) momentum aggregation, where clients use the aggregated momentum for local initialization; (2) local-global alignment, where the local gradients are aligned with the global update direction to significantly reduce client drift. Theoretically, we prove that FedMuon achieves a linear speedup convergence rate of $\mathcal{O}(1/\sqrt{SKR})$ without the heterogeneity assumption, where $S$ is the number of participating clients per round, $K$ is the number of local iterations, and $R$ is the total number of communication rounds. Empirically, we validate the effectiveness of FedMuon on language and vision models. Compared to several baselines, FedMuon significantly reduces communication rounds and improves test accuracy.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Virginia (0.04)
- Europe > Switzerland (0.04)
- Asia > China > Tianjin Province > Tianjin (0.04)
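For context, the Muon step the abstract builds on orthogonalizes the momentum matrix with a Newton-Schulz iteration rather than an explicit SVD. Below is a minimal NumPy sketch; the quintic coefficients follow the public Muon implementation, while `aligned_grad` is only one plausible reading of the paper's local-global alignment (a convex blend with the last global direction), and `alpha` is an assumed hyperparameter.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Approximate U @ V.T from the SVD G = U S V.T with a quintic
    Newton-Schulz iteration, avoiding an explicit SVD."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-7)           # spectral norm <= 1 after scaling
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T                                  # keep the Gram matrix small
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

def local_muon_step(W, grad, M, lr=0.02, beta=0.95):
    """One local Muon update on a matrix-shaped parameter. FedMuon's
    momentum aggregation would seed M from the aggregated server momentum."""
    M = beta * M + grad
    W = W - lr * newton_schulz_orthogonalize(M)
    return W, M

def aligned_grad(local_grad, global_dir, alpha=0.5):
    """Local-global alignment (assumed form): blend the local gradient
    with the last global update direction to damp client drift."""
    return (1 - alpha) * local_grad + alpha * global_dir
```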
An Efficient Subspace Algorithm for Federated Learning on Heterogeneous Data
Zhang, Jiaojiao, Xu, Yuqi, Yuan, Kun
This work addresses the key challenges of applying federated learning to large-scale deep neural networks, particularly client drift due to data heterogeneity across clients and the high costs of communication, computation, and memory. We propose FedSub, an efficient subspace algorithm for federated learning on heterogeneous data. Specifically, FedSub uses subspace projection to constrain the local updates of each client to low-dimensional subspaces, thereby reducing communication, computation, and memory costs. Additionally, it incorporates low-dimensional dual variables to mitigate client drift. We provide a convergence analysis that reveals how key factors such as the step size and the subspace projection matrices affect convergence, and experimental results demonstrate the efficiency of FedSub.
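A minimal sketch of the kind of subspace-restricted local update FedSub describes: all iterates stay in the column space of a projection matrix $P$, and a low-dimensional dual variable corrects the projected gradient. The specific correction rule and how `y` is maintained are assumptions; the abstract does not spell out the dual update.

```python
import numpy as np

def local_subspace_steps(x, grad_fn, P, y, lr=0.1, steps=10):
    """FedSub-style local training (sketch): P is d x k with orthonormal
    columns, y is the client's k-dimensional dual variable, and only the
    k-dimensional coordinates z are ever updated or communicated."""
    z = P.T @ x                          # low-dimensional coordinates
    for _ in range(steps):
        g = P.T @ grad_fn(P @ z)         # project the full gradient into the subspace
        z = z - lr * (g - y)             # drift correction via the dual variable
    return P @ z                         # map back to the full parameter space
```

Communication drops from $d$ to $k$ floats per round, which is where the stated savings in communication, computation, and memory would come from.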
FedEve: On Bridging the Client Drift and Period Drift for Cross-device Federated Learning
Shen, Tao, Li, Zexi, Zhu, Didi, Zhao, Ziyu, Wu, Chao, Wu, Fei
Federated learning (FL) is a machine learning paradigm that allows multiple clients to collaboratively train a shared model without exposing their private data. Data heterogeneity is a fundamental challenge in FL and can result in poor convergence and performance degradation. Client drift, which arises from the multiple local updates in FedAvg, has been recognized as one factor contributing to this issue. However, in cross-device FL, a different form of drift arises from partial client participation, and it has not been well studied. This drift, which we refer to as period drift, occurs because the clients participating in each communication round may exhibit a data distribution that deviates from that of the full client population. It can be more harmful than client drift, since the optimization objective shifts with every round. In this paper, we investigate the interaction between period drift and client drift, finding that period drift can have a particularly detrimental effect on cross-device FL as the degree of data heterogeneity increases. To tackle these issues, we propose a predict-observe framework and present an instantiated method, FedEve, in which these two types of drift compensate for each other to mitigate their overall impact. We provide theoretical evidence that our approach can reduce the variance of model updates. Extensive experiments demonstrate that our method outperforms alternatives on non-IID data in cross-device settings.
- Europe > Austria > Vienna (0.14)
- North America > United States > Virginia (0.04)
- Asia > China (0.04)
- North America > Canada > Ontario > Toronto (0.04)
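The predict-observe idea can be illustrated with a Kalman-style fusion: the server predicts the next global update (e.g., by momentum extrapolation) and corrects it with the observed aggregate of the participating clients' updates. This is a sketch of one way such a framework could look; the gain rule and the variance estimates are assumptions, not FedEve's actual equations.

```python
import numpy as np

def predict_observe_fuse(prediction, observation, var_pred, var_obs):
    """Blend a predicted global update with the observed round aggregate.
    A high-variance observation (few or heavily skewed participants, i.e.
    period drift) pulls the result toward the prediction, and vice versa."""
    gain = var_pred / (var_pred + var_obs)       # scalar Kalman-style gain
    return prediction + gain * (observation - prediction)
```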
FedAPM: Federated Learning via ADMM with Partial Model Personalization
Zhu, Shengkun, Nie, Feiteng, Zeng, Jinshan, Wang, Sheng, Sun, Yuan, Yao, Yuan, Chen, Shangfeng, Xu, Quanqing, Yang, Chuanhui
In federated learning (FL), the assumption that datasets from different devices are independent and identically distributed (i.i.d.) often does not hold due to user differences, and the presence of various data modalities across clients makes using a single model impractical. Personalizing certain parts of the model can effectively address these issues by allowing those parts to differ across clients, while the remaining parts serve as a shared model. However, we found that partial model personalization may exacerbate client drift (each client's local model diverges from the shared model), thereby reducing the effectiveness and efficiency of FL algorithms. We propose an FL framework based on the alternating direction method of multipliers (ADMM), referred to as FedAPM, to mitigate client drift. We construct the augmented Lagrangian function by incorporating first-order and second-order proximal terms into the objective, with the second-order term providing fixed correction and the first-order term offering compensatory correction between the local and shared models. Our analysis demonstrates that FedAPM, by using explicit estimates of the Lagrange multiplier, is more stable and efficient in terms of convergence compared to other FL frameworks. We establish the global convergence of FedAPM training from arbitrary initial points to a stationary point, achieving three types of rates: constant, linear, and sublinear, under mild assumptions. We conduct experiments using four heterogeneous and multimodal datasets with different metrics to validate the performance of FedAPM. Specifically, FedAPM achieves faster and more accurate convergence, outperforming the SOTA methods with average improvements of 12.3% in test accuracy, 16.4% in F1 score, and 18.0% in AUC while requiring fewer communication rounds.
- North America > Canada > Ontario > Toronto (0.05)
- Asia > China > Hubei Province > Wuhan (0.04)
- North America > United States > Virginia (0.04)
- (8 more...)
- Overview (0.92)
- Research Report > New Finding (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
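The roles of the two proximal terms in FedAPM's augmented Lagrangian can be made concrete with a small sketch: for client $i$ with shared parameters $u_i$, personal parameters $v_i$, and multiplier $\lambda_i$, the local objective is $f_i(u_i, v_i) + \langle \lambda_i, u_i - u \rangle + \tfrac{\rho}{2}\|u_i - u\|^2$. The gradient-step realization below is an assumption; the paper may solve the local subproblem differently.

```python
import numpy as np

def fedapm_local_step(u_i, v_i, u_global, lam, grad_u, grad_v, rho=1.0, lr=0.1):
    """One local step on the augmented Lagrangian: rho * (u_i - u_global)
    is the fixed second-order correction, lam the compensatory first-order
    one. The personal part v_i takes a plain descent step."""
    u_i = u_i - lr * (grad_u + lam + rho * (u_i - u_global))
    v_i = v_i - lr * grad_v
    return u_i, v_i

def multiplier_update(lam, u_i, u_global, rho=1.0):
    """Dual ascent on the consensus constraint u_i = u_global."""
    return lam + rho * (u_i - u_global)
```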
Kuramoto-FedAvg: Using Synchronization Dynamics to Improve Federated Learning Optimization under Statistical Heterogeneity
Muhebwa, Aggrey, Selialia, Khotso, Anwar, Fatima, Osman, Khalid K.
Federated learning on heterogeneous (non-IID) client data experiences slow convergence due to client drift. To address this challenge, we propose Kuramoto-FedAvg, a federated optimization algorithm that reframes the weight aggregation step as a synchronization problem inspired by the Kuramoto model of coupled oscillators. The server dynamically weights each client's update based on its phase alignment with the global update, amplifying contributions that align with the global gradient direction while minimizing the impact of updates that are out of phase. We theoretically prove that this synchronization mechanism reduces client drift, providing a tighter convergence bound than standard FedAvg under heterogeneous data distributions. Empirical validation supports our theoretical findings, showing that Kuramoto-FedAvg significantly accelerates convergence and improves accuracy across multiple benchmark datasets. Our work highlights the potential of coordination- and synchronization-based strategies for managing gradient diversity and accelerating federated optimization in realistic non-IID settings.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- North America > United States > Virginia (0.04)
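A minimal sketch of the synchronization-weighted aggregation described above: each (flattened) client update gets a "phase" from its angle to the global update direction, and out-of-phase updates are down-weighted. The exponential damping is an assumed stand-in for the paper's Kuramoto-derived coupling.

```python
import numpy as np

def kuramoto_aggregate(updates, global_dir, coupling=1.0):
    """Weight each client update by its phase alignment with the global
    direction; updates and global_dir are flattened 1-D arrays."""
    g = global_dir / (np.linalg.norm(global_dir) + 1e-12)
    weights = []
    for u in updates:
        cos = (u @ g) / (np.linalg.norm(u) + 1e-12)
        theta = np.arccos(np.clip(cos, -1.0, 1.0))   # phase difference in [0, pi]
        weights.append(np.exp(-coupling * theta))    # damp out-of-phase clients
    w = np.asarray(weights)
    w /= w.sum()
    return sum(wi * ui for wi, ui in zip(w, updates))
```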
Non-convex composite federated learning with heterogeneous data
Zhang, Jiaojiao, Hu, Jiang, Johansson, Mikael
We propose an innovative algorithm for non-convex composite federated learning that decouples the proximal operator evaluation from the communication between server and clients. Moreover, each client uses local updates to communicate less frequently with the server, sends only a single $d$-dimensional vector per communication round, and overcomes issues with client drift. In the analysis, challenges arise from the decoupling strategy and local updates in the algorithm, as well as from the non-convex and non-smooth nature of the problem. We establish sublinear and linear convergence to a bounded residual error under general non-convexity and the proximal Polyak-Łojasiewicz inequality, respectively. In the numerical experiments, we demonstrate the superiority of our algorithm over state-of-the-art methods on both synthetic and real datasets.
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts (0.04)
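To make the composite structure concrete: the objective is $\min_x \sum_i f_i(x) + g(x)$ with $g$ non-smooth, and the decoupling means only the server evaluates the proximal operator of $g$, once per round, after averaging the single $d$-dimensional vectors the clients send. Below is a sketch for $g = \tau \|x\|_1$, an illustrative choice rather than necessarily the paper's regularizer.

```python
import numpy as np

def prox_l1(x, t):
    """Proximal operator of t * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def server_round(x, client_vectors, lr=1.0, tau=0.01):
    """One server round (sketch): average the clients' d-dimensional
    vectors, take the smooth step, then apply the prox exactly once."""
    avg = np.mean(client_vectors, axis=0)
    return prox_l1(x - lr * avg, lr * tau)
```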
Federated Learning with Sample-level Client Drift Mitigation
Xu, Haoran, Li, Jiaze, Wu, Wanyi, Ren, Hao
Federated Learning (FL) suffers from severe performance degradation due to data heterogeneity among clients. Existing works reveal that the fundamental reason is that data heterogeneity causes client drift, where the local model update deviates from the global one, and thus they usually tackle this problem by calibrating the obtained local update. Despite their effectiveness, existing methods lack a deep understanding of how heterogeneous data samples contribute to the formation of client drift. In this paper, we bridge this gap by identifying that the drift can be viewed as a cumulative manifestation of the biases present in all local samples, that these biases differ across samples, and that they change dynamically as FL training progresses. Motivated by this, we propose FedBSS, which is the first to mitigate the heterogeneity issue at the sample level, orthogonal to existing methods. Specifically, the core idea of our method is a bias-aware sample selection scheme that, epoch by epoch, selects samples from small bias to large to progressively train the local model in each round. To ensure training stability, we use a diversified knowledge-acquisition stage as a warm-up, avoiding the local optima caused by knowledge deviation early in training. Evaluation results show that FedBSS outperforms state-of-the-art baselines. In addition, we also achieve strong results under feature-distribution-skew and noisy-label settings, which shows that FedBSS not only reduces heterogeneity but is also scalable and robust.
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- (2 more...)
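The curriculum at the heart of FedBSS can be sketched as a selection schedule: after a warm-up that uses all samples for diverse knowledge acquisition, each epoch trains on a growing pool of the smallest-bias samples. How the per-sample bias is scored and how fast the pool grows are assumptions here; the abstract does not pin them down.

```python
import numpy as np

def bias_aware_selection(bias_scores, epoch, num_epochs, warmup_epochs=2):
    """Return the indices of the samples to train on this epoch:
    everything during warm-up, then a linearly growing pool ordered
    from smallest to largest bias."""
    n = len(bias_scores)
    if epoch < warmup_epochs:
        return np.arange(n)                    # warm-up: all samples
    frac = (epoch + 1) / num_epochs            # pool grows with the epoch
    k = max(1, int(frac * n))
    return np.argsort(bias_scores)[:k]         # smallest-bias samples first
```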
Federated-Continual Dynamic Segmentation of Histopathology guided by Barlow Continuity
Babendererde, Niklas, Zhu, Haozhe, Fuchs, Moritz, Stieber, Jonathan, Mukhopadhyay, Anirban
Federated and Continual Learning are established approaches for privacy-aware learning on continuously changing data, as required for deploying AI systems on histopathology images. However, data shifts can occur in a dynamic world: spatially between institutions and temporally, as data changes over time. This leads to two issues: Client Drift, where the central model degrades from aggregating updates from clients trained on shifted data, and Catastrophic Forgetting, caused by temporal shifts such as changes in patient populations. Both tend to degrade the model's performance on previously seen data or in spatially distributed training. Although both problems arise from the same underlying cause, data shift, existing research addresses them only individually. In this work, we introduce a method that jointly alleviates Client Drift and Catastrophic Forgetting using our proposed Dynamic Barlow Continuity, which evaluates client updates on a public reference dataset and uses this signal to guide the training process toward a spatially and temporally shift-invariant model. We evaluate our approach on the histopathology datasets BCSS and Semicol and show that it is highly effective, jointly improving the Dice score from 15.8% to 71.6% under Client Drift and from 42.5% to 62.8% under Catastrophic Forgetting. This enables Dynamic Learning by establishing spatio-temporal shift-invariance.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Health & Medicine > Diagnostic Medicine (0.69)
- Health & Medicine > Therapeutic Area > Oncology (0.68)
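A sketch of how a Barlow-Twins-style continuity signal on a public reference dataset could score a client update: compare the features that the previous model and the client-updated model produce on the same reference batch, and penalize deviation of their cross-correlation matrix from the identity. The scoring details below follow the standard Barlow Twins objective and are assumptions, not the paper's exact Dynamic Barlow Continuity.

```python
import numpy as np

def barlow_continuity_score(feats_old, feats_new, off_diag_weight=0.005):
    """feats_old/feats_new: (batch, dim) features of the reference data
    under the previous model and the client-updated model. Lower score
    = more shift-invariant update."""
    z1 = (feats_old - feats_old.mean(0)) / (feats_old.std(0) + 1e-6)
    z2 = (feats_new - feats_new.mean(0)) / (feats_new.std(0) + 1e-6)
    c = z1.T @ z2 / len(z1)                              # cross-correlation matrix
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()            # invariance term
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # redundancy term
    return on_diag + off_diag_weight * off_diag
```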
C2A: Client-Customized Adaptation for Parameter-Efficient Federated Learning
Kim, Yeachan, Kim, Junho, Mok, Wing-Lam, Park, Jun-Hyung, Lee, SangKeun
Despite the versatility of pre-trained language models (PLMs) across domains, their large memory footprints pose significant challenges in federated learning (FL), where the training model has to be distributed between a server and clients. One potential way to bypass such constraints is parameter-efficient fine-tuning (PEFT) in the context of FL. However, we observe that typical PEFT severely suffers from heterogeneity among clients in FL scenarios, resulting in unstable and slow convergence. In this paper, we propose Client-Customized Adaptation (C2A), a novel hypernetwork-based FL framework that generates client-specific adapters by conditioning on client information. Owing to the effectiveness of hypernetworks in generating customized weights by learning to adapt to the different characteristics of inputs, C2A can maximize the utility of shared model parameters while minimizing the divergence caused by client heterogeneity. To verify the efficacy of C2A, we perform extensive evaluations on FL scenarios involving heterogeneity in label and language distributions. Comprehensive evaluation results clearly support the superiority of C2A in terms of both efficiency and effectiveness in FL scenarios.
- Asia > South Korea > Seoul > Seoul (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
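The hypernetwork mechanism can be sketched as a small function from a client embedding to adapter weights; only the hypernetwork is shared and aggregated, while each client's adapter is generated on the fly. The layer sizes, the tanh activation, and the source of the client embedding (e.g., a summary of the client's label or language distribution) are assumptions for illustration.

```python
import numpy as np

def init_hypernet(embed_dim, hidden, d_model, r, seed=0):
    """A tiny hypernetwork: client embedding -> bottleneck-adapter factors."""
    rng = np.random.default_rng(seed)
    scale = 0.02
    return {
        "W1":     rng.standard_normal((embed_dim, hidden)) * scale,
        "W_down": rng.standard_normal((hidden, d_model * r)) * scale,
        "W_up":   rng.standard_normal((hidden, r * d_model)) * scale,
    }

def generate_adapter(hypernet, client_embedding, d_model, r):
    """Generate client-customized adapter weights from the shared
    hypernetwork, conditioning on the client embedding."""
    h = np.tanh(client_embedding @ hypernet["W1"])
    down = (h @ hypernet["W_down"]).reshape(d_model, r)   # d_model -> r
    up = (h @ hypernet["W_up"]).reshape(r, d_model)       # r -> d_model
    return down, up
```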