AITopics | model drift

Country: North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Neural Information Processing SystemsDec-26-2025, 14:12:26 GMT

Hierarchical Federated Learning with Multi-Timescale Gradient Correction

While traditional federated learning (FL) typically focuses on a star topology where clients are directly connected to a central server, real-world distributed systems often exhibit hierarchical architectures. Hierarchical FL (HFL) has emerged as a promising solution to bridge this gap, leveraging aggregation points at multiple levels of the system. However, existing algorithms for HFL encounter challenges in dealing with multi-timescale model drift, i.e., model drift occurring across hierarchical levels of data heterogeneity. In this paper, we propose a multi-timescale gradient correction (MTGC) methodology to resolve this issue. Our key idea is to introduce distinct control variables to (i) correct the client gradient towards the group gradient, i.e., to reduce client model drift caused by local updates based on individual datasets, and (ii) correct the group gradient towards the global gradient, i.e., to reduce group model drift caused by FL over clients within the group. We analytically characterize the convergence behavior of MTGC under general non-convex settings, overcoming challenges associated with couplings between correction terms. We show that our convergence bound is immune to the extent of data heterogeneity, confirming the stability of the proposed algorithm against multi-level non-i.i.d.

artificial intelligence, machine learning, proceedings, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Neural Information Processing SystemsOct-11-2025, 00:31:21 GMT

Hierarchical Federated Learning with Multi-Timescale Gradient Correction

We analytically characterize the convergence behavior of MTGC under general non-convex settings, overcoming challenges associated with couplings between correction terms.

algorithm, gradient, nullnull null 2, (15 more...)

Country: North America > Canada > Ontario > Toronto (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (0.67)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Lee, Chang-Hwan, Shim, Alexander

Detecting Model Drifts in Non-Stationary Environment Using Edit Operation Measures

arXiv.org Artificial IntelligenceSep-16-2025

Reinforcement learning (RL) agents typically assume stationary environment dynamics. Yet in real-world applications such as healthcare, robotics, and finance, transition probabilities or reward functions may evolve, leading to model drift. This paper proposes a novel framework to detect such drifts by analyzing the distributional changes in sequences of agent behavior. Specifically, we introduce a suite of edit operation-based measures to quantify deviations between state-action trajectories generated under stationary and perturbed conditions. Our experiments demonstrate that these measures can effectively distinguish drifted from non-drifted scenarios, even under varying levels of noise, providing a practical tool for drift detection in non-stationary RL environments.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2509.11367

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Florida > Hillsborough County > Tampa (0.04)

Genre:

Research Report > Experimental Study (0.95)
Research Report > New Finding (0.93)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Neural Information Processing SystemsMay-27-2025, 08:45:56 GMT

Hierarchical Federated Learning with Multi-Timescale Gradient Correction

While traditional federated learning (FL) typically focuses on a star topology where clients are directly connected to a central server, real-world distributed systems often exhibit hierarchical architectures. Hierarchical FL (HFL) has emerged as a promising solution to bridge this gap, leveraging aggregation points at multiple levels of the system. However, existing algorithms for HFL encounter challenges in dealing with multi-timescale model drift, i.e., model drift occurring across hierarchical levels of data heterogeneity. In this paper, we propose a multi-timescale gradient correction (MTGC) methodology to resolve this issue. Our key idea is to introduce distinct control variables to (i) correct the client gradient towards the group gradient, i.e., to reduce client model drift caused by local updates based on individual datasets, and (ii) correct the group gradient towards the global gradient, i.e., to reduce group model drift caused by FL over clients within the group. We analytically characterize the convergence behavior of MTGC under general non-convex settings, overcoming challenges associated with couplings between correction terms.

hierarchical federated learning, model drift, multi-timescale gradient correction, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.77)

arXiv.org Artificial IntelligenceMar-23-2025

FedSKD: Aggregation-free Model-heterogeneous Federated Learning using Multi-dimensional Similarity Knowledge Distillation

Weng, Ziqiao, Cai, Weidong, Zhou, Bo

Federated learning (FL) enables privacy-preserving collaborative model training without direct data sharing. Model-heterogeneous FL (MHFL) extends this paradigm by allowing clients to train personalized models with heterogeneous architectures tailored to their computational resources and application-specific needs. However, existing MHFL methods predominantly rely on centralized aggregation, which introduces scalability and efficiency bottlenecks, or impose restrictions requiring partially identical model architectures across clients. While peer-to-peer (P2P) FL removes server dependence, it suffers from model drift and knowledge dilution, limiting its effectiveness in heterogeneous settings. To address these challenges, we propose FedSKD, a novel MHFL framework that facilitates direct knowledge exchange through round-robin model circulation, eliminating the need for centralized aggregation while allowing fully heterogeneous model architectures across clients. FedSKD's key innovation lies in multi-dimensional similarity knowledge distillation, which enables bidirectional cross-client knowledge transfer at batch, pixel/voxel, and region levels for heterogeneous models in FL. This approach mitigates catastrophic forgetting and model drift through progressive reinforcement and distribution alignment while preserving model heterogeneity. Extensive evaluations on fMRI-based autism spectrum disorder diagnosis and skin lesion classification demonstrate that FedSKD outperforms state-of-the-art heterogeneous and homogeneous FL baselines, achieving superior personalization (client-specific accuracy) and generalization (cross-institutional adaptability). These findings underscore FedSKD's potential as a scalable and robust solution for real-world medical federated learning applications.

artificial intelligence, data mining, machine learning, (16 more...)

2503.18981

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Dermatology (0.89)
Health & Medicine > Therapeutic Area > Neurology > Autism (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Panda, Pranoy, Srinivas, Kancheti Sai, Balasubramanian, Vineeth N, Sinha, Gaurav

Interpretable Model Drift Detection

arXiv.org Artificial IntelligenceMar-9-2025

Data in the real world often has an evolving distribution. Thus, machine learning models trained on such data get outdated over time. This phenomenon is called model drift. Knowledge of this drift serves two purposes: (i) Retain an accurate model and (ii) Discovery of knowledge or insights about change in the relationship between input features and output variable w.r.t. the model. Most existing works focus only on detecting model drift but offer no interpretability. In this work, we take a principled approach to study the problem of interpretable model drift detection from a risk perspective using a feature-interaction aware hypothesis testing framework, which enjoys guarantees on test power. The proposed framework is generic, i.e., it can be adapted to both classification and regression tasks. Experiments on several standard drift detection datasets show that our method is superior to existing interpretable methods (especially on real-world datasets) and on par with state-of-the-art black-box drift detection methods. We also quantitatively and qualitatively study the interpretability aspect including a case study on USENET2 dataset. We find our method focuses on model and drift sensitive features compared to baseline interpretable drift detectors.

dataset, detection, drift detection, (13 more...)

doi: 10.1145/3632410.3632434

2503.06606

Country:

Asia > India > Karnataka > Bengaluru (0.05)
Asia > India > Telangana > Hyderabad (0.04)
Oceania > Australia > New South Wales (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Fang, Wenzhi, Han, Dong-Jun, Chen, Evan, Wang, Shiqiang, Brinton, Christopher G.

Hierarchical Federated Learning with Multi-Timescale Gradient Correction

arXiv.org Artificial IntelligenceDec-16-2024

While traditional federated learning (FL) typically focuses on a star topology where clients are directly connected to a central server, real-world distributed systems often exhibit hierarchical architectures. Hierarchical FL (HFL) has emerged as a promising solution to bridge this gap, leveraging aggregation points at multiple levels of the system. However, existing algorithms for HFL encounter challenges in dealing with multi-timescale model drift, i.e., model drift occurring across hierarchical levels of data heterogeneity. In this paper, we propose a multi-timescale gradient correction (MTGC) methodology to resolve this issue. Our key idea is to introduce distinct control variables to (i) correct the client gradient towards the group gradient, i.e., to reduce client model drift caused by local updates based on individual datasets, and (ii) correct the group gradient towards the global gradient, i.e., to reduce group model drift caused by FL over clients within the group. We analytically characterize the convergence behavior of MTGC under general non-convex settings, overcoming challenges associated with couplings between correction terms. We show that our convergence bound is immune to the extent of data heterogeneity, confirming the stability of the proposed algorithm against multi-level non-i.i.d.

artificial intelligence, gradient, machine learning, (16 more...)

2409.18448

Country: North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial IntelligenceJul-2-2024

FedEx: Expediting Federated Learning over Heterogeneous Mobile Devices by Overlapping and Participant Selection

Geng, Jiaxiang, Li, Boyu, Qin, Xiaoqi, Li, Yixuan, Li, Liang, Hou, Yanzhao, Pan, Miao

Training latency is critical for the success of numerous intrigued applications ignited by federated learning (FL) over heterogeneous mobile devices. By revolutionarily overlapping local gradient transmission with continuous local computing, FL can remarkably reduce its training latency over homogeneous clients, yet encounter severe model staleness, model drifts, memory cost and straggler issues in heterogeneous environments. To unleash the full potential of overlapping, we propose, FedEx, a novel \underline{fed}erated learning approach to \underline{ex}pedite FL training over mobile devices under data, computing and wireless heterogeneity. FedEx redefines the overlapping procedure with staleness ceilings to constrain memory consumption and make overlapping compatible with participation selection (PS) designs. Then, FedEx characterizes the PS utility function by considering the latency reduced by overlapping, and provides a holistic PS solution to address the straggler issue. FedEx also introduces a simple but effective metric to trigger overlapping, in order to avoid model drifts. Experimental results show that compared with its peer designs, FedEx demonstrates substantial reductions in FL training latency over heterogeneous mobile devices with limited memory cost.

fl training, latency, mobile device, (15 more...)

2407.00943

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > China > Beijing > Beijing (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(14 more...)

Genre: Research Report > New Finding (0.48)

Industry: Transportation > Freight & Logistics Services (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Ramírez, Camilo, Silva, Jorge F., Tamssaouet, Ferhat, Rojas, Tomás, Orchard, Marcos E.

Fault Detection and Monitoring using an Information-Driven Strategy: Method, Theory, and Application

arXiv.org Artificial IntelligenceMay-6-2024

The ability to detect when a system undergoes an incipient fault is of paramount importance in preventing a critical failure. In this work, we propose an information-driven fault detection method based on a novel concept drift detector. The method is tailored to identifying drifts in input-output relationships of additive noise models (i.e., model drifts) and is based on a distribution-free mutual information (MI) estimator. Our scheme does not require prior faulty examples and can be applied distribution-free over a large class of system models. Our core contributions are twofold. First, we demonstrate the connection between fault detection, model drift detection, and testing independence between two random variables. Second, we prove several theoretical properties of the proposed MI-based fault detection scheme: (i) strong consistency, (ii) exponentially fast detection of the non-faulty case, and (iii) control of both significance levels and power of the test. To conclude, we validate our theory with synthetic data and the benchmark dataset N-CMAPSS of aircraft turbofan engines. These empirical results support the usefulness of our methodology in many practical and realistic settings, and the theoretical results show performance guarantees that other methods cannot offer.

conséquence, correspond, detection, (16 more...)

2405.03667

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Energy (0.45)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)