Collaborating Authors



AnomalyDetection is an open-source R package to detect anomalies which is robust, from a statistical standpoint, in the presence of seasonality and an underlying trend. The AnomalyDetection package can be used in wide variety of contexts. The underlying algorithm – referred to as Seasonal Hybrid ESD (S-H-ESD) builds upon the Generalized ESD test for detecting anomalies. Note that S-H-ESD can be used to detect both global as well as local anomalies. This is achieved by employing time series decomposition and using robust statistical metrics, viz., median together with ESD.

Weakly-supervised Deep Anomaly Detection with Pairwise Relation Learning Machine Learning

This paper studies a rarely explored but critical anomaly detection problem: weakly-supervised anomaly detection with limited labeled anomalies and a large unlabeled data set. This problem is very important because it (i) enables anomaly-informed modeling which helps identify anomalies of interests and address the notorious high false positives in unsupervised anomaly detection, and (ii) eliminates the reliance on large-scale and complete labeled anomaly data in fully-supervised settings. However, the problem is especially challenging since we have only limited labeled data for a single class, and moreover, the seen anomalies often cannot cover all types of anomalies (i.e., unseen anomalies). We address this problem by formulating the problem as a pairwise relation learning task. Particularly, our approach defines a two-stream ordinal regression network to learn the relation of randomly selected instance pairs, i.e., whether the instance pair contains labeled anomalies or just unlabeled data instances. The resulting model leverages both the labeled and unlabeled data to effectively augment the data and learn generalized representations of both normality and abnormality. Extensive empirical results show that our approach (i) significantly outperforms state-of-the-art competing methods in detecting both seen and unseen anomalies and (ii) is substantially more data-efficient. Introduction Anomaly detection aims at identifying exceptional data instances that have a significant deviation from the majority of data instances, which can offer important insights into broad applications, such as identifying fraudulent transactions or insider trading, detecting network intrusions, and early detection of diseases.

Time Series of Price Anomaly Detection with LSTM


Autoencoders are an unsupervised learning technique, although they are trained using supervised learning methods. The goal is to minimize reconstruction error based on a loss function, such as the mean squared error. In this post, we will try to detect anomalies in the Johnson & Johnson's historical stock price time series data with an LSTM autoencoder.

Detecting Relative Anomaly Machine Learning

System states that are anomalous from the perspective of a domain expert occur frequently in some anomaly detection problems. The performance of commonly used unsupervised anomaly detection methods may suffer in that setting, because they use frequency as a proxy for anomaly. We propose a novel concept for anomaly detection, called relative anomaly detection. It is tailored to be robust towards anomalies that occur frequently, by taking into account their location relative to the most typical observations. The approaches we develop are computationally feasible even for large data sets, and they allow real-time detection. We illustrate using data sets of potential scraping attempts and Wi-Fi channel utilization, both from Google, Inc.

Group Anomaly Detection using Flexible Genre Models

Neural Information Processing Systems

An important task in exploring and analyzing real-world data sets is to detect unusual and interesting phenomena. In this paper, we study the group anomaly detection problem. Unlike traditional anomaly detection research that focuses on data points, our goal is to discover anomalous aggregated behaviors of groups of points. For this purpose, we propose the Flexible Genre Model (FGM). FGM is designed to characterize data groups at both the point level and the group level so as to detect various types of group anomalies. We evaluate the effectiveness of FGM on both synthetic and real data sets including images and turbulence data, and show that it is superior to existing approaches in detecting group anomalies.