A Typology of Data Anomalies

arXiv.org Artificial Intelligence

Anomalies are cases that are in some way unusual and do not appear to fit the general patterns present in the dataset. Several conceptualizations exist to distinguish between different types of anomalies. However, these are either too specific to be generally applicable or so abstract that they neither provide concrete insight into the nature of anomaly types nor facilitate the functional evaluation of anomaly detection algorithms. With the recent criticism of 'black box' algorithms and analytics, it has become clear that this is an undesirable situation. This paper therefore introduces a general typology of anomalies that offers a clear and tangible definition of the different types of anomalies in datasets. The typology also facilitates the evaluation of the functional capabilities of anomaly detection algorithms and, as a framework, assists in analyzing the conceptual levels of data, patterns and anomalies. Finally, it serves as an analytical tool for studying anomaly types from other typologies.
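To make the opening definition concrete, a value that deviates strongly from the general pattern of a dataset can be flagged with something as simple as a z-score rule. The sketch below is an invented illustration, not an example drawn from the paper's typology:

```python
# Hypothetical illustration of the basic notion of an anomaly: a case
# that does not fit the general pattern of the data. This simple z-score
# rule is for illustration only and is not part of the paper's typology.
import numpy as np

data = np.array([9.8, 10.1, 9.9, 10.2, 10.0, 25.0, 10.1])

z = (data - data.mean()) / data.std()
anomalies = data[np.abs(z) > 2]   # cases far from the general pattern
print(anomalies)                  # -> [25.]
```

Real anomaly types are of course richer than this single-feature case, which is precisely the gap a typology of anomalies addresses.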


How to Analyze Your Predictable Data: Anomaly Detection

@machinelearnbot

Imagine for a moment that you have invented a time machine. It is a shiny device that allows you to travel to the future and see what is about to happen. The first time you use it, you arrive in the near future with all the whirring, clanking, and smoke you would expect from time travel. As you squint through the haze, you become convinced the machine didn't actually work, because the near future looks the same as the time you left. Shortly afterward you are pulled back to your departure time, and you slowly realize that the future it showed you is exactly what is now happening.


Interpretation of Isolation Forest with SHAP

#artificialintelligence

Isolation Forest is one of the most widely used techniques for detecting anomalies in data. It's based on a "forest" of trees, where each isolation tree isolates anomalous observations from the rest of the data points. Despite its simplicity, speed and intuitiveness, there is a drawback. Why is a particular observation considered anomalous by the algorithm? How can the output be interpreted?
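A common way to answer these questions is to pair the model with SHAP. The sketch below is a minimal example, assuming scikit-learn's IsolationForest and the shap package (whose TreeExplainer supports tree ensembles such as Isolation Forest); the toy dataset is invented for illustration:

```python
# Minimal sketch: explaining Isolation Forest scores with SHAP.
# Assumes scikit-learn and the shap package are installed; the toy
# dataset here is purely illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest
import shap

rng = np.random.RandomState(42)
X = rng.normal(0, 1, size=(200, 3))   # mostly "normal" points
X[:5] += 6                            # a few obvious anomalies

model = IsolationForest(random_state=42).fit(X)
scores = model.decision_function(X)   # lower scores = more anomalous

# TreeExplainer works on the fitted isolation trees; the resulting SHAP
# values attribute each observation's anomaly score to its features.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

print(shap_values[0])                 # per-feature contributions, point 0
```

Each row of shap_values decomposes one observation's anomaly score into per-feature contributions, so the features that pushed a point toward isolation can be read off directly.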


How Machine Learning Detects Anomalies in Healthcare

#artificialintelligence

The digital revolution has changed the healthcare landscape irrevocably. Patients expect faster, more efficient care that costs less, which is where artificial intelligence (AI) can help. AI and machine learning allow healthcare organizations to evolve and keep up with trends and new methodologies. Data science enables systems to ingest massive quantities of information quickly and to generate insights and predictions that let healthcare organizations focus human attention on what's really important: providing quality care. One technique that is essential for healthcare data teams, physicians, insurance analysts, and others to understand is anomaly detection.


Out-of-Distribution Dynamics Detection: RL-Relevant Benchmarks and Results

arXiv.org Artificial Intelligence

We study the problem of out-of-distribution dynamics (OODD) detection, which involves detecting when the dynamics of a temporal process change compared to the training-distribution dynamics. This is relevant to applications in control, reinforcement learning (RL), and multivariate time series, where changes to test-time dynamics can impact the performance of learning controllers/predictors in unknown ways. This problem is particularly important in the context of deep RL, where learned controllers often overfit to the training environment. Currently, however, there is a lack of established OODD benchmarks for the types of environments commonly used in RL research. Our first contribution is to design a set of OODD benchmarks derived from common RL environments with varying types and intensities of OODD. Our second contribution is to design a strong OODD baseline approach based on recurrent implicit quantile networks (RIQNs), which monitors autoregressive prediction errors for OODD detection. Our final contribution is to evaluate the RIQN approach on the benchmarks to provide baseline results for future comparison.
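As a rough illustration of the error-monitoring idea only (this sketch is not the paper's RIQN approach, and names such as predict_next are hypothetical stand-ins), an OODD detector can compare a learned one-step dynamics model's prediction errors against a threshold calibrated on in-distribution rollouts:

```python
# Conceptual sketch of error-based OODD detection: monitor one-step
# prediction errors and flag when they exceed a threshold calibrated on
# in-distribution (training) rollouts. NOT the paper's RIQN method;
# predict_next stands in for any learned dynamics model.
import numpy as np

def calibrate_threshold(in_dist_errors, quantile=0.99):
    """Choose an error threshold from prediction errors collected on
    in-distribution (training) rollouts."""
    return np.quantile(in_dist_errors, quantile)

def detect_oodd(states, predict_next, threshold, window=10):
    """Return the first time step at which the mean recent one-step
    prediction error exceeds the calibrated threshold, else None."""
    errors = []
    for t in range(len(states) - 1):
        pred = predict_next(states[t])              # model's next-state guess
        errors.append(np.linalg.norm(states[t + 1] - pred))
        if t + 1 >= window and np.mean(errors[-window:]) > threshold:
            return t + 1                            # dynamics look OOD here
    return None
```

The quantile-based threshold mirrors the general practice of calibrating on training-distribution errors, while the sliding window smooths out single-step noise before raising a detection.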