Anomaly detection for time-series data has been an important research field for a long time. Seminal work on anomaly detection methods has been focussing on statistical approaches. In recent years an increasing number of machine learning algorithms have been developed to detect anomalies on time-series. Subsequently, researchers tried to improve these techniques using (deep) neural networks. In the light of the increasing number of anomaly detection methods, the body of research lacks a broad comparative evaluation of statistical, machine learning and deep learning methods. This paper studies 20 univariate anomaly detection methods from the all three categories. The evaluation is conducted on publicly available datasets, which serve as benchmarks for time-series anomaly detection. By analyzing the accuracy of each method as well as the computation time of the algorithms, we provide a thorough insight about the performance of these anomaly detection approaches, alongside some general notion of which method is suited for a certain type of data.
The Mahalanobis distance-based confidence score, a recently proposed anomaly detection method for pre-trained neural classifiers, achieves state-of-the-art performance on both out-of-distribution and adversarial example detection. This work analyzes why this method exhibits such strong performance while imposing an implausible assumption; namely, that class conditional distributions of intermediate features have tied covariance. We reveal that the reason for its effectiveness has been misunderstood. Although this method scores the prediction confidence for the original classification task, our analysis suggests that information critical for classification task does not contribute to state-of-the-art performance on anomaly detection. To support this hypothesis, we demonstrate that a simpler confidence score that does not use class information is as effective as the original method in most cases. Moreover, our experiments show that the confidence scores can exhibit different behavior on other frameworks such as metric learning models, and their detection performance is sensitive to model architecture choice. These findings provide insight into the behavior of neural classifiers when provided with anomalous inputs.
Although deep learning has been applied to successfully address many data mining problems, relatively limited work has been done on deep learning for anomaly detection. Existing deep anomaly detection methods, which focus on learning new feature representations to enable downstream anomaly detection methods, perform indirect optimization of anomaly scores, leading to data-inefficient learning and suboptimal anomaly scoring. Also, they are typically designed as unsupervised learning due to the lack of large-scale labeled anomaly data. As a result, they are difficult to leverage prior knowledge (e.g., a few labeled anomalies) when such information is available as in many real-world anomaly detection applications. This paper introduces a novel anomaly detection framework and its instantiation to address these problems. Instead of representation learning, our method fulfills an end-to-end learning of anomaly scores by a neural deviation learning, in which we leverage a few (e.g., multiple to dozens) labeled anomalies and a prior probability to enforce statistically significant deviations of the anomaly scores of anomalies from that of normal data objects in the upper tail. Extensive results show that our method can be trained substantially more data-efficiently and achieves significantly better anomaly scoring than state-of-the-art competing methods.
Novelty detection is the unsupervised problem of identifying anomalies in test data which significantly differ from the training set. Novelty detection is one of the classic challenges in Machine Learning and a core component of several research areas such as fraud detection, intrusion detection, medical diagnosis, data cleaning, and fault prevention. While numerous algorithms were designed to address this problem, most methods are only suitable to model continuous numerical data. Tackling datasets composed of mixed-type features, such as numerical and categorical data, or temporal datasets describing discrete event sequences is a challenging task. In addition to the supported data types, the key criteria for efficient novelty detection methods are the ability to accurately dissociate novelties from nominal samples, the interpretability, the scalability and the robustness to anomalies located in the training data. In this thesis, we investigate novel ways to tackle these issues. In particular, we propose (i) an experimental comparison of novelty detection methods for mixed-type data (ii) an experimental comparison of novelty detection methods for sequence data, (iii) a probabilistic nonparametric novelty detection method for mixed-type data based on Dirichlet process mixtures and exponential-family distributions and (iv) an autoencoder-based novelty detection model with encoder/decoder modelled as deep Gaussian processes.
Anomaly detection is often considered a challenging field of machine learning due to the difficulty of obtaining anomalous samples for training and the need to obtain a sufficient amount of training data. In recent years, autoencoders have been shown to be effective anomaly detectors that train only on "normal" data. Generative adversarial networks (GANs) have been used to generate additional training samples for classifiers, thus making them more accurate and robust. However, in anomaly detection GANs are only used to reconstruct existing samples rather than to generate additional ones. This stems both from the small amount and lack of diversity of anomalous data in most domains. In this study we propose MDGAN, a novel GAN architecture for improving anomaly detection through the generation of additional samples. Our approach uses two discriminators: a dense network for determining whether the generated samples are of sufficient quality (i.e., valid) and an autoencoder that serves as an anomaly detector. MDGAN enables us to reconcile two conflicting goals: 1) generate high-quality samples that can fool the first discriminator, and 2) generate samples that can eventually be effectively reconstructed by the second discriminator, thus improving its performance. Empirical evaluation on a diverse set of datasets demonstrates the merits of our approach.