AITopics | drift detector

drift detector

Prepared for the Unknown: Adapting AIOps Capacity Forecasting Models to Data Changes

Poenaru-Olaru, Lorena, Hof, Wouter van 't, Stando, Adrian, Trawinski, Arkadiusz P., Kapel, Eileen, Rellermeyer, Jan S., Cruz, Luis, van Deursen, Arie

arXiv.org Artificial Intelligence, Oct-14-2025

Abstract: Capacity management is critical for software organizations to allocate resources effectively and meet operational demands. An important step in capacity management, predicting future resource needs, often relies on data-driven analytics and machine learning (ML) forecasting models, which require frequent retraining to stay relevant as data evolves. Continuously retraining the forecasting models can be expensive and difficult to scale, posing a challenge for engineering teams tasked with balancing accuracy and efficiency. Retraining only when the data changes appears to be a more computationally efficient alternative, but its impact on accuracy requires further investigation. In this work, we investigate the effects of retraining capacity forecasting models for time series based on detected changes in the data compared to periodic retraining. Our results show that drift-based retraining achieves comparable forecasting accuracy to periodic retraining in most cases, making it a cost-effective strategy. However, in cases where data is changing rapidly, periodic retraining is still preferred to maximize the forecasting accuracy. These findings offer actionable insights for software teams to enhance forecasting systems, reducing retraining overhead while maintaining robust performance.
The term capacity management refers to ensuring that an IT service has sufficient infrastructure and resources to meet current or future demand. Although capacity management is crucial to ensure efficient and effective service delivery, this process used to be carried out manually by continuously collecting and analyzing data [32]. Manual techniques for predicting capacity requirements become difficult to scale as the number of capacity management data sources increases, and they are significantly time-consuming for the engineers in charge. To automate capacity management for machine utilization metrics, like CPU and memory, companies have started employing forecasting AIOps models, which predict the resource demand in a timely fashion. This is particularly relevant for our industry partner, ING (International Netherlands Group) Bank, where operational engineers must monitor numerous time series to ensure sufficient resources are allocated for its large-scale online operations, supported by thousands of machines with varying resource demands.
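The contrast between periodic and drift-based retraining described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the mean-shift detector, the function names, the window size, and the threshold are all hypothetical choices made for illustration, and the model refit is reduced to a counter.

```python
from statistics import mean

def mean_shift_detected(reference, recent, threshold):
    """Hypothetical detector: flag drift when the window means differ by more than threshold."""
    return abs(mean(recent) - mean(reference)) > threshold

def count_retrainings(series, window, threshold=None):
    """Walk the series in fixed windows and count how often a model would be retrained.

    threshold=None  -> periodic retraining (every window).
    threshold=value -> drift-based retraining (only when the detector fires).
    """
    reference = series[:window]
    retrainings = 0
    for start in range(window, len(series) - window + 1, window):
        recent = series[start:start + window]
        if threshold is None or mean_shift_detected(reference, recent, threshold):
            retrainings += 1      # placeholder for the actual model refit
            reference = recent    # the newest data becomes the reference
    return retrainings

# Toy utilisation series: stable at 40%, then a level shift to 70%.
cpu = [40.0] * 30 + [70.0] * 30
periodic = count_retrainings(cpu, window=10)
drift_based = count_retrainings(cpu, window=10, threshold=5.0)
print(periodic, drift_based)
```

On this toy series the periodic strategy refits in every window, while the drift-based strategy refits only at the level shift. This is the cost saving the abstract refers to; the accuracy trade-off it reports is not modelled here.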

data mining, forecasting model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.1032

Country:

North America > United States (0.14)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services (0.88)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts

Aspis, Miguel, Ordónez, Sebastián A. Cajas, Suárez-Cetrulo, Andrés L., Carbajo, Ricardo Simón

arXiv.org Machine Learning, Jul-25-2025

Learning from non-stationary data streams subject to concept drift requires models that can adapt on-the-fly while remaining resource-efficient. Existing adaptive ensemble methods often rely on coarse-grained adaptation mechanisms or simple voting schemes that fail to optimally leverage specialized knowledge. This paper introduces DriftMoE, an online Mixture-of-Experts (MoE) architecture that addresses these limitations through a novel co-training framework. DriftMoE features a compact neural router that is co-trained alongside a pool of incremental Hoeffding tree experts. The key innovation lies in a symbiotic learning loop that enables expert specialization: the router selects the most suitable expert for prediction, the relevant experts update incrementally with the true label, and the router refines its parameters using a multi-hot correctness mask that reinforces every accurate expert. This feedback loop provides the router with a clear training signal while accelerating expert specialization. We evaluate DriftMoE's performance across nine state-of-the-art data stream learning benchmarks spanning abrupt, gradual, and real-world drifts, testing two distinct configurations: one where experts specialize on data regimes (multi-class variant), and another where they focus on single-class specialization (task-based variant). Our results demonstrate that DriftMoE achieves competitive results with state-of-the-art stream learning adaptive ensembles, offering a principled and efficient approach to concept drift adaptation.
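The learning loop in this abstract (route, predict, update the selected expert, reinforce every correct expert through a multi-hot mask) can be sketched in plain Python. This is a simplified illustration under loud assumptions, not the DriftMoE implementation: `CentroidExpert` is a stand-in for a Hoeffding tree, `LinearRouter` is a single linear layer with one sigmoid output per expert rather than the paper's neural router, only the top-1 expert is trained, and all names and the learning rate are hypothetical.

```python
import math
import random

class CentroidExpert:
    """Stand-in incremental expert (nearest class centroid), NOT a Hoeffding tree."""
    def __init__(self):
        self.sums = {}    # label -> per-feature running sum
        self.counts = {}  # label -> number of samples seen

    def predict(self, x):
        if not self.counts:
            return 0  # arbitrary default before any training
        def sq_dist(label):
            n = self.counts[label]
            return sum((xi - s / n) ** 2 for xi, s in zip(x, self.sums[label]))
        return min(self.counts, key=sq_dist)

    def learn(self, x, y):
        sums = self.sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            sums[i] += xi
        self.counts[y] = self.counts.get(y, 0) + 1

class LinearRouter:
    """Assumed router: one sigmoid score per expert, trained by a cross-entropy gradient step."""
    def __init__(self, n_features, n_experts, lr=0.1):
        self.w = [[0.0] * (n_features + 1) for _ in range(n_experts)]
        self.lr = lr

    def scores(self, x):
        xb = list(x) + [1.0]  # append bias input
        return [1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, xb))))
                for w in self.w]

    def update(self, x, mask):
        xb = list(x) + [1.0]
        for w, p, m in zip(self.w, self.scores(x), mask):
            for i, xi in enumerate(xb):
                w[i] -= self.lr * (p - m) * xi  # gradient of binary cross-entropy

def step(router, experts, x, y):
    """One prequential step: predict first, then learn from the true label."""
    scores = router.scores(x)
    chosen = max(range(len(experts)), key=scores.__getitem__)
    preds = [expert.predict(x) for expert in experts]
    mask = [1.0 if p == y else 0.0 for p in preds]  # multi-hot correctness mask
    experts[chosen].learn(x, y)                      # only the selected expert is updated
    router.update(x, mask)                           # every correct expert is reinforced
    return preds[chosen]

random.seed(0)
router = LinearRouter(n_features=1, n_experts=2)
experts = [CentroidExpert(), CentroidExpert()]
hits = 0
for _ in range(200):
    x = [random.random()]
    y = int(x[0] > 0.5)
    hits += step(router, experts, x, y) == y
print(hits)
```

The point of the sketch is the order of operations inside `step`: the mask is computed from every expert's prediction, so the router receives a training signal even for experts it did not select.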

artificial intelligence, driftmoe, machine learning, (20 more...)

arXiv.org Machine Learning

2507.18464

Country:

Oceania > Australia > New South Wales (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Education > Educational Setting > Online (0.47)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

IncA-DES: An incremental and adaptive dynamic ensemble selection approach using online K-d tree neighborhood search for data streams with concept drift

Barboza, Eduardo V. L., de Almeida, Paulo R. Lisboa, Britto, Alceu de Souza Jr., Sabourin, Robert, Cruz, Rafael M. O.

arXiv.org Artificial Intelligence, Jul-18-2025

Data streams pose challenges not usually encountered in batch-based ML. One of them is concept drift, which is characterized by the change in data distribution over time. Among many approaches explored in the literature, the fusion of classifiers has been showing good results and is getting growing attention. Dynamic selection (DS) methods, due to the ensemble being instance-based, seem to be an efficient choice under drifting scenarios. However, some attention must be paid to adapting such methods for concept drift. The training must be done in order to create local experts, and the commonly used neighborhood-search DS may become prohibitive with the continuous arrival of data. In this work, we propose IncA-DES, which employs a training strategy that promotes the generation of local experts with the assumption that different regions of the feature space become available with time. Additionally, the fusion of a concept drift detector supports the maintenance of information and adaptation to a new concept. An overlap-based classification filter is also employed in order to avoid using the DS method when there is a consensus in the neighborhood, a strategy that we argue every DS method should employ, as it was shown to make them more applicable and quicker. Moreover, aiming to reduce the processing time of the kNN, we propose an Online K-d tree algorithm, which can quickly remove instances without becoming inconsistent and deals with unbalancing concerns that may occur in data streams. Experimental results showed that the proposed framework achieved the best average accuracy compared to seven state-of-the-art methods considering different levels of label availability and had the shortest processing time among the most accurate methods. Additionally, the fusion with the Online K-d tree has improved processing time with a negligible loss in accuracy. We have made our framework available in an online repository.
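The consensus filter mentioned in this abstract can be illustrated with a short Python sketch: when all neighbours of a query share one label, that label is returned directly and the more expensive dynamic selection step is skipped. This is an assumption-laden illustration, not the IncA-DES code: the neighbour search is brute force rather than an Online K-d tree, and `classify`, `k_nearest`, and the `majority_vote` fallback standing in for dynamic selection are hypothetical names.

```python
from collections import Counter

def k_nearest(x, window, k):
    """Brute-force neighbour search over (features, label) pairs; a K-d tree would replace this."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(x, item[0]))
    return sorted(window, key=sq_dist)[:k]

def majority_vote(x, neighbours):
    """Hypothetical stand-in for the dynamic selection step."""
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def classify(x, window, k, select_and_predict):
    """Return (label, used_selection); selection runs only when neighbours disagree."""
    neighbours = k_nearest(x, window, k)
    labels = {label for _, label in neighbours}
    if len(labels) == 1:
        return labels.pop(), False  # consensus in the neighbourhood: skip selection
    return select_and_predict(x, neighbours), True

window = [
    ([0.0, 0.0], "a"), ([0.1, 0.0], "a"), ([0.0, 0.1], "a"),
    ([1.0, 1.0], "b"), ([1.1, 1.0], "b"), ([1.0, 1.1], "b"),
]
print(classify([0.05, 0.05], window, 3, majority_vote))  # deep inside class "a"
print(classify([0.45, 0.50], window, 4, majority_vote))  # between the two classes
```

The first query lies in a region with a single class, so the selection function is never called; the second has neighbours from both classes and falls through to it.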

artificial intelligence, concept drift, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.inffus.2025.103272

2507.12573

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > New York > New York County > New York City (0.04)
South America > Brazil > Paraná > Curitiba (0.04)
(5 more...)

Genre: Research Report > New Finding (0.54)

Industry:

Information Technology > Security & Privacy (0.67)
Education > Educational Setting > Online (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Sustainable Machine Learning Retraining: Optimizing Energy Efficiency Without Compromising Accuracy

Poenaru-Olaru, Lorena, Sallou, June, Cruz, Luis, Rellermeyer, Jan, van Deursen, Arie

arXiv.org Artificial Intelligence, Jun-18-2025

The reliability of machine learning (ML) software systems is heavily influenced by changes in data over time. For that reason, ML systems require regular maintenance, typically based on model retraining. However, retraining requires significant computational demand, which makes it energy-intensive and raises concerns about its environmental impact. To understand which retraining techniques should be considered when designing sustainable ML applications, in this work, we study the energy consumption of common retraining techniques. Since the accuracy of ML systems is also essential, we compare retraining techniques in terms of both energy efficiency and accuracy. We show that retraining with only the most recent data, compared to all available data, reduces energy consumption by up to 25%, being a sustainable alternative to the status quo. Furthermore, our findings show that retraining a model only when there is evidence that updates are necessary, rather than on a fixed schedule, can reduce energy consumption by up to 40%, provided a reliable data change detector is in place. Our findings pave the way for better recommendations for ML practitioners, guiding them toward more energy-efficient retraining techniques when designing sustainable ML software systems.
The increasing adoption of Machine Learning (ML) and Artificial Intelligence (AI) within organizations has resulted in the development of more ML/AI software systems [1]. Although ML/AI brings plenty of business value, it is known that the accuracy of ML applications decreases over time [2]. Thus, ML developers must monitor and maintain their ML systems in production. One reason for this phenomenon is the fact that ML applications are highly dependent on the data on which they have been trained. Real-world data usually changes over time [3], a phenomenon often referred to as concept drift [4], which can significantly impact the normal operation of ML systems [5]. Therefore, appropriate maintenance techniques are required for the design of ML software systems. One common approach to maintaining these systems is to periodically update these applications by retraining the underlying ML models with the latest version of the data [6], [7]. On another note, the process of training machine learning models has raised substantial concerns about the carbon footprint of ML applications [8], [9].

artificial intelligence, drift detector, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2506.13838

Country:

North America > United States (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Germany > Lower Saxony > Hanover (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Energy (1.00)
Information Technology > Services (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Synthetic Non-stationary Data Streams for Recognition of the Unknown

Komorniczak, Joanna

arXiv.org Machine Learning, May-21-2025

The problem of data non-stationarity is commonly addressed in data stream processing. In a dynamic environment, methods should continuously be ready to analyze time-varying data -- hence, they should enable incremental training and respond to concept drifts. An equally important variability typical for non-stationary data stream environments is the emergence of new, previously unknown classes. Often, methods focus on one of these two phenomena -- detection of concept drifts or detection of novel classes -- while both difficulties can be observed in data streams. Additionally, concerning previously unknown observations, the topic of open set of classes has become particularly important in recent years, where the goal of methods is to efficiently classify within known classes and recognize objects outside the model competence. This article presents a strategy for synthetic data stream generation in which both concept drifts and the emergence of new classes representing unknown objects occur. The presented research shows how unsupervised drift detectors address the task of detecting novelty and concept drifts and demonstrates how the generated data streams can be utilized in the open set recognition task.
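A stream that contains both phenomena named in this abstract, a concept drift and the emergence of a previously unknown class, can be sketched with a few lines of Python. This is only a toy illustration and not the generation strategy proposed in the article: the one-dimensional Gaussian classes, the class centres, and the parameters `drift_at` and `novel_at` are hypothetical.

```python
import random

def stream(n_samples, drift_at, novel_at, seed=0):
    """Yield (t, feature, label) for a toy 1-D stream with one abrupt drift and one novel class."""
    rng = random.Random(seed)
    centres = {0: 0.0, 1: 3.0}
    for t in range(n_samples):
        if t == drift_at:
            # Abrupt concept drift: the two known classes swap their centres.
            centres[0], centres[1] = centres[1], centres[0]
        if t == novel_at:
            # Novelty: a class that was never seen before starts to appear.
            centres[2] = 6.0
        label = rng.choice(sorted(centres))
        yield t, rng.gauss(centres[label], 0.5), label

samples = list(stream(300, drift_at=100, novel_at=200))
known_only = all(label in (0, 1) for t, _, label in samples if t < 200)
novel_seen = any(label == 2 for t, _, label in samples if t >= 200)
print(known_only, novel_seen)
```

A drift detector and an open set recogniser can then be evaluated against the known change points `drift_at` and `novel_at`.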

concept drift, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

2505.13745

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
South America > Brazil > Maranhão (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.32)

Early Concept Drift Detection via Prediction Uncertainty

Lu, Pengqian, Lu, Jie, Liu, Anjin, Zhang, Guangquan

arXiv.org Artificial Intelligence, Dec-15-2024

Concept drift, characterized by unpredictable changes in data distribution over time, poses significant challenges to machine learning models in streaming data scenarios. Although error rate-based concept drift detectors are widely used, they often fail to identify drift in the early stages when the data distribution changes but error rates remain constant. This paper introduces the Prediction Uncertainty Index (PU-index), derived from the prediction uncertainty of the classifier, as a superior alternative to the error rate for drift detection. Our theoretical analysis demonstrates that: (1) The PU-index can detect drift even when error rates remain stable. (2) Any change in the error rate will lead to a corresponding change in the PU-index. These properties make the PU-index a more sensitive and robust indicator for drift detection compared to existing methods. We also propose a PU-index-based Drift Detector (PUDD) that employs a novel Adaptive PU-index Bucketing algorithm for detecting drift. Empirical evaluations on both synthetic and real-world datasets demonstrate PUDD's efficacy in detecting drift in structured and image data.
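Property (1) above, that uncertainty can move while the error rate stays flat, can be shown with a small numeric example in Python. The uncertainty measure used here (one minus the largest predicted probability, averaged over a window) is an assumed stand-in; it is neither the paper's PU-index definition nor its Adaptive PU-index Bucketing algorithm, and the probability values are made up for illustration.

```python
def error_rate(probs, labels):
    """Fraction of samples whose arg-max prediction differs from the true label."""
    wrong = sum(max(range(len(p)), key=p.__getitem__) != y
                for p, y in zip(probs, labels))
    return wrong / len(labels)

def mean_uncertainty(probs):
    """Assumed uncertainty measure: 1 minus the top predicted probability, averaged."""
    return sum(1.0 - max(p) for p in probs) / len(probs)

labels = [0] * 50 + [1] * 50
# Before the change the classifier is confident; afterwards it is still
# correct on every sample but much closer to the decision boundary.
before = [[0.95, 0.05]] * 50 + [[0.05, 0.95]] * 50
after = [[0.60, 0.40]] * 50 + [[0.40, 0.60]] * 50

print(error_rate(before, labels), error_rate(after, labels))
print(mean_uncertainty(before), mean_uncertainty(after))
```

An error-rate detector sees no change between the two windows, while the uncertainty signal shifts clearly, which is the early-warning behaviour the abstract argues for.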

artificial intelligence, concept drift, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.11158

Country:

South America > Brazil > Maranhão (0.04)
Oceania > Australia > New South Wales (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Adversarial Attacks for Drift Detection

Hinder, Fabian, Vaquet, Valerie, Hammer, Barbara

arXiv.org Machine Learning, Nov-25-2024

Data from the real world is often subject to continuous changes known as concept drift [1, 2, 3]. Such changes can be caused by seasonal effects, changed demands, aging of sensors, etc. Concept drift not only poses a problem for maintaining high performance in learning models [2, 3] but also plays a crucial role in system monitoring [1]. In the latter case, the detection of concept drift is crucial as it enables the detection of anomalous behavior. Examples include machine malfunctions or failures, network security, environmental changes, and critical infrastructures. This is done by detecting irregular shifts [4, 1, 5]. In these contexts, the ability to robustly detect drift is essential. In addition to problems such as noise and sampling error, which challenge all statistical methods, drift detection faces a special kind of difficulty when the drift follows certain patterns that evade detection. In this work, we study those specific drifts that we will refer to as "drift adversarials". Similar to adversarial attacks, drift adversarials exploit weaknesses in the detection methods, and thus allow significant concept drift to occur without triggering alarms, posing major issues for monitoring systems.

artificial intelligence, drift adversarial, machine learning, (16 more...)

arXiv.org Machine Learning

2411.16591

Country: Europe > Germany (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.71)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Structuring the Processing Frameworks for Data Stream Evaluation and Application

Komorniczak, Joanna, Ksieniewicz, Paweł, Zyblewski, Paweł

arXiv.org Artificial Intelligence, Nov-11-2024

The following work addresses the problem of frameworks for data stream processing that can be used to evaluate the solutions in an environment that resembles real-world applications. The definition of structured frameworks stems from a need to reliably evaluate the data stream classification methods, considering the constraints of delayed and limited label access. Current experimental evaluation often exploits, without restriction, the assumption of complete and immediate access to labels to monitor the recognition quality and to adapt the methods to the changing concepts. The problem is addressed by reviewing currently described methods and techniques for data stream processing and verifying their outcomes in a simulated environment. The result of the work is a proposed taxonomy of data stream processing frameworks, showing the linkage between drift detection and classification methods considering the natural phenomenon of label delay.

artificial intelligence, data stream, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2411.06799

Country:

Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
South America > Brazil > Maranhão (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

SUDS: A Strategy for Unsupervised Drift Sampling

Fellicious, Christofer, Wendlinger, Lorenz, Gancarski, Mario, Mitrovic, Jelena, Granitzer, Michael

arXiv.org Artificial Intelligence, Nov-5-2024

Supervised machine learning often encounters concept drift, where the data distribution changes over time, degrading model performance. Existing drift detection methods focus on identifying these shifts but often overlook the challenge of acquiring labeled data for model retraining after a shift occurs. We present the Strategy for Drift Sampling (SUDS), a novel method that selects homogeneous samples for retraining using existing drift detection algorithms, thereby enhancing model adaptability to evolving data. SUDS seamlessly integrates with current drift detection techniques. We also introduce the Harmonized Annotated Data Accuracy Metric (HADAM), a metric that evaluates classifier performance in relation to the quantity of annotated data required to achieve the stated performance, thereby taking into account the difficulty of acquiring labeled data. Our contributions are twofold: SUDS combines drift detection with strategic sampling to improve the retraining process, and HADAM provides a metric that balances classifier performance with the amount of labeled data, ensuring efficient resource utilization. Empirical results demonstrate the efficacy of SUDS in optimizing labeled data use in dynamic environments, significantly improving the performance of machine learning applications in real-world scenarios. Our code is open source and available at https://github.com/cfellicious/SUDS/

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.02995

Country:

South America > Brazil (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.48)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Drift Detection: Introducing Gaussian Split Detector

Fuccellaro, Maxime, Simon, Laurent, Zemmari, Akka

arXiv.org Artificial Intelligence, May-14-2024

Recent research has yielded a wide array of drift detectors. However, in order to achieve remarkable performance, the true class labels must be available during the drift detection phase. This paper targets detecting drift when the ground truth is unknown during the detection phase. To that end, we introduce Gaussian Split Detector (GSD), a novel drift detector that works in batch mode. GSD is designed to work when the data follow a normal distribution and makes use of Gaussian mixture models to monitor changes in the decision boundary. The algorithm is designed to handle multi-dimensional data streams and to work without the ground truth labels during the inference phase, making it pertinent for real-world use. In an extensive experimental study on real and synthetic datasets, we evaluate our detector against the state of the art. We show that our detector outperforms the state of the art in detecting real drift and in ignoring virtual drift, which is key to avoiding false alarms.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2405.08637

Country:

North America > United States > California > Orange County > Irvine (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
