AITopics | data drift

data drift

Information about AI from the News, Publications, and Conferences

Open-Source Drift Detection Tools in Action: Insights from Two Use Cases

Müller, Rieke, Abdelaal, Mohamed, Stjelja, Davor

arXiv.org Artificial Intelligence, May-10-2024

Data drifts pose a critical challenge in the lifecycle of machine learning (ML) models, affecting their performance and reliability. In response to this challenge, we present a microbenchmark study, called D3Bench, which evaluates the efficacy of open-source drift detection tools. D3Bench examines the capabilities of Evidently AI, NannyML, and Alibi-Detect, leveraging real-world data from two smart building use cases. We prioritize assessing the functional suitability of these tools to identify and analyze data drifts. Furthermore, we consider a comprehensive set of non-functional criteria, such as the integrability with ML pipelines, the adaptability to diverse data types, user-friendliness, computational efficiency, and resource demands. Our findings reveal that Evidently AI stands out for its general data drift detection, whereas NannyML excels at pinpointing the precise timing of shifts and evaluating their consequent effects on predictive accuracy.

dataset, evidently ai, nannyml, (13 more...)

arXiv.org Artificial Intelligence

2404.18673

Country:

North America > United States > District of Columbia > Washington (0.05)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.71)
Information Technology (0.48)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

MLOps with enhanced performance control and observability

Banerjee, Indradumna, Ghanta, Dinesh, Nautiyal, Girish, Sanchana, Pradeep, Katageri, Prateek, Modi, Atin

arXiv.org Artificial Intelligence, Feb-2-2023

The explosion of data and its ever-increasing complexity in the last few years has made MLOps systems more prone to failure, and new tools need to be embedded in such systems to avoid such failures. In this demo, we introduce crucial tools in the observability module of an MLOps system that target difficult issues like data drift and model version control for optimum model selection. We believe integrating these features into our MLOps pipeline would go a long way toward building a robust system immune to early-stage ML system failures.

artificial intelligence, machine learning, mlop system, (13 more...)

arXiv.org Artificial Intelligence

2302.01061

Country: North America > United States > California > Alameda County > Berkeley (0.05)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

MLOps -- Understanding Data Drift. Types of Data Drifts and Monitoring…

#artificialintelligence, Jan-9-2023, 20:15:43 GMT

One of the important functions of MLOps engineers is to monitor the model performance. Data drift causes degradation in the model performance over a period of time. Let's discuss data drift and the steps we can take to detect it in detail. Data drift refers to changes in the data distribution over a period of time. Data drift can lead to poor model performance, because the model is being applied to data that is different from the data it was trained on.
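One common way to check whether a feature's distribution has changed is a two-sample Kolmogorov-Smirnov (KS) comparison between training data and recent production data. The snippet above does not say which test the article uses, so the following is only a minimal sketch under assumptions: the function name `ks_statistic`, the synthetic Gaussian samples, and the sample sizes are all illustrative.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


# Synthetic data (assumption): one feature, standard normal at training time.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(1000)]
prod_same = [rng.gauss(0.0, 1.0) for _ in range(1000)]     # no drift
prod_shifted = [rng.gauss(1.0, 1.0) for _ in range(1000)]  # mean shifted by 1

stat_same = ks_statistic(train, prod_same)
stat_shifted = ks_statistic(train, prod_shifted)
print(f"KS without drift: {stat_same:.3f}")
print(f"KS with drift:    {stat_shifted:.3f}")
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap). What value should trigger an alert depends on the sample sizes and on the application, so any threshold is a choice the monitoring team has to make.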

artificial intelligence, machine learning, training and test data, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.33)

Data Drift - Types, causes and measures.

#artificialintelligence, Mar-16-2022, 22:00:05 GMT

In most big data analysis applications, data evolve over time and must be analyzed and treated in near real time. Patterns and interactions in such data often change, so models built for analyzing such data quickly become outdated. In machine learning and data mining, this phenomenon is referred to as data drift. Data drift is a change in the distribution of data over time. For machine learning models, data drift is the difference between the distribution of the baseline data set on which the model was trained and the distribution of the current real-time production data.
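The comparison between a baseline distribution and a production distribution can be made concrete with the Population Stability Index (PSI), which bins both samples and sums the per-bin differences weighted by their log ratio. The article snippet does not name a specific measure, so this is a sketch under assumptions: the function name `psi`, the ten equal-width bins, the `eps` floor for empty bins, and the synthetic data are all illustrative choices.

```python
import math
import random


def psi(baseline, production, bins=10, eps=1e-6):
    """Population Stability Index using equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline must contain at least two distinct values")
    width = (hi - lo) / bins

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range fall into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    p = fractions(baseline)
    q = fractions(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic data (assumption): baseline is standard normal.
rng = random.Random(1)
baseline = [rng.gauss(0.0, 1.0) for _ in range(2000)]
prod_same = [rng.gauss(0.0, 1.0) for _ in range(2000)]
prod_shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

psi_same = psi(baseline, prod_same)
psi_shifted = psi(baseline, prod_shifted)
print(f"PSI without drift: {psi_same:.3f}")
print(f"PSI with drift:    {psi_shifted:.3f}")
```

A PSI near zero means the binned distributions agree, and larger values mean a larger shift. The number of bins and the alerting cutoff both affect sensitivity, so they are worth tuning on historical data before relying on them.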

disadvantage, intersection, probability distribution, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hamed ZITOUN on LinkedIn: #machinelearning

#artificialintelligence, Nov-29-2021, 07:30:48 GMT

When data quality is fine, there are two usual suspects: data drift or concept drift. Data Drift -- The input data has changed. The distribution of the variables is meaningfully different. As a result, the trained model is not relevant for this new data. Concept Drift -- In contrast to the data drift, the distributions might even remain the same.
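The contrast between the two kinds of drift can be illustrated with a toy example. This is a sketch under stated assumptions, not something taken from the post: the single feature, the threshold-based labeling rules, and the frozen `model` function are all hypothetical. Under data drift the inputs themselves move, which an input-level check can see. Under concept drift the inputs look the same, but the rule that maps them to labels has changed, so only the model's accuracy reveals the problem.

```python
import random

rng = random.Random(42)


def model(x):
    """Frozen model learned under the old concept: predict 1 when x > 0."""
    return 1 if x > 0.0 else 0


def accuracy(xs, ys):
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


def mean(values):
    return sum(values) / len(values)


# Training-time situation: inputs ~ N(0, 1), label is 1 when x > 0.
xs_old = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ys_old = [1 if x > 0.0 else 0 for x in xs_old]

# Data drift: the input distribution moves to N(2, 1).
xs_data_drift = [rng.gauss(2.0, 1.0) for _ in range(2000)]

# Concept drift: inputs are still N(0, 1), but the label is now 1 only when x > 1.
xs_concept = [rng.gauss(0.0, 1.0) for _ in range(2000)]
ys_concept = [1 if x > 1.0 else 0 for x in xs_concept]

print(f"input mean, training:      {mean(xs_old):.2f}")
print(f"input mean, data drift:    {mean(xs_data_drift):.2f}")
print(f"input mean, concept drift: {mean(xs_concept):.2f}")
print(f"accuracy, training:        {accuracy(xs_old, ys_old):.2f}")
print(f"accuracy, concept drift:   {accuracy(xs_concept, ys_concept):.2f}")
```

In this toy setup, monitoring only the input distribution would flag the data drift case but miss the concept drift case, which is why the two are usually monitored with different signals: input statistics for the former, and labeled performance metrics for the latter.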

concept drift, hamed zitoun, linkedin, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.85)

Deployment ML-OPS Guide Series - 2

#artificialintelligence, Sep-9-2021, 13:32:17 GMT

The most exciting moment for any machine learning system is when you get to deploy your model, but deployment is hard for two reasons. The first is statistical: past model performance is no longer guaranteed in the future, and performance degrades over time as the data changes, especially when the model is deployed in a cloud environment with frequent data changes. The second is a systems issue: the ML system needs frequent monitoring, which is manual and tedious and should be automated as much as possible. How do we deal with the statistical issue of degrading model performance? How do we handle data changes once the model is deployed? That is where concept drift and data drift come into the picture. Concept drift occurs when the desired mapping from x to y changes, so the productized model makes inaccurate predictions as the data distribution shifts substantially.

deployment ml-op guide series, house property, online purchase, (7 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Four Ways to Build MLOps that Avoid "Data Drift" in Machine Learning - IEEE Innovation at Work

#artificialintelligence, Aug-24-2021, 19:31:17 GMT

MLOps isn't an algorithm, but it does operationalize the algorithm to simplify the predictive process,

data drift, ieee innovation, machine learning, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Avoid These Data Pitfalls When Moving Machine Learning Applications Into Production

#artificialintelligence, Jun-22-2021, 14:10:50 GMT

How often have you heard "The Machine Learning Application worked well in the lab, but it failed in the field"? It is not the fault of the Machine Learning Model! This blog is not yet another blog article (YABA) on DataOps, DevOps, MLOps, or CloudOps. I do not mean to imply xOps is not essential. For example, MLOps is both strategic and tactical. It promises to transform the "ad-hoc" delivery of Machine Learning applications into software engineering best practices. We know the symptoms: Most machine-learning models trained in the lab perform poorly on real-world data [1, 2, 3, 4]. Machine Learning created profits in the year 2020 and will continue to increase profits in the future. However, many problems hold back the progress and success of Machine Learning application rollout to production. I focus on what is the most significant problem or cause: the quality and quantity of input data in Machine Learning models [1, 4]. We realized the quantity of high-quality data was the bottleneck in predictive accuracy when we started showing near, or above, human-level performance in structured data, imagery, game playing, and natural language tasks. How many times do we look at the Machine Learning application lifecycle's conceptualization to realize a Machine Learning model is not at the beginning (Figure 2)? We can research and improve the tools of the Machine Learning application lifecycle. But that only lowers the cost of deployment. Arguably, the Machine Learning model's choice is not a critical part of deploying a Machine Learning application. We have a "good enough" process or pipeline to choose and change the Machine Learning model, given a training input dataset. However, when achieving State-of-the-Art (SOTA) results, the input data seems to have the most significant impact on the output predictive data (Figure 2). We seem to know the cause: input data that was garbage results in garbage output predictive data.
New data input to a trained Machine Learning model determines the accuracy of the output. We divide Machine Learning input data into four arbitrary categories, defined by the Machine Learning application output accuracy. GPT-3 is an example [6]. GPT-3 was trained with an enormous amount of data [6]. GPT-3 is frozen in time as a transformer that you access through an API. Concept Drift is a change in what to predict. For example, the definition of "what is a spammer." We do not cover Concept Drift here. I do not think of it as a problem but rather as a change in the solution's scope. An example of Case 2: Data Drift, is that Case 1: "It works!", is a temporal phenomenon.

application, learning application, machine learning application, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Data Drift and Machine Learning Model Sustainability

#artificialintelligence, Oct-29-2020, 11:18:37 GMT

In many real-world applications where machine learning models have been deployed in production, the data evolve over time, and models built for analyzing such data quickly become obsolete. It becomes essential for data scientists to monitor model performance over time. Is the deployed machine learning model sustainable and performing consistently? Usually, performance degrades not because the model itself stops working but because it can no longer capture the variability of the data, whether in the dependent or the independent variables. The cause lies not in the machine learning model itself but in the data distributions.

artificial intelligence, data drift, machine learning model sustainability

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A Primer on Data Drift & Drift Detection Techniques

#artificialintelligence, Oct-8-2020, 08:18:50 GMT

After obtaining a PhD in Biomedical Image Processing in 2011, Simona Maggio worked in several companies (CEA, Thales, Rakuten) as a Research Engineer in Computer Vision and Natural Language Processing for applications ranging from video surveillance to document digitization and e-commerce.

artificial intelligence, drift & drift detection technique, natural language, (2 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.86)
