Holst, Anders (Swedish Institute of Computer Science) | Bohlin, Markus (Swedish Institute of Computer Science) | Ekman, Jan (Swedish Institute of Computer Science) | Sellin, Ola (Bombardier Transportation) | Lindström, Björn (Addiva Consulting AB) | Larsen, Stefan (Addiva Eduro AB)
We have developed a method for statistical anomaly detection which has been deployed in a tool for condition monitoring of train fleets. The tool is currently used by several railway operators around the world to inspect and visualize the occurrence of event messages generated on the trains. The anomaly detection component helps the operators to quickly find significant deviations from normal behavior and to detect early indications of possible problems. The savings in maintenance costs come mainly from avoiding costly breakdowns and have been estimated at several million euros per year for the tool. In the long run, maintenance costs are expected to be reduced by 5–10% through use of the tool.
Anomalies (unusual patterns) in time-series data provide essential, and often actionable, information in critical situations. Examples can be found in fields such as healthcare, intrusion detection, finance, security, and flight safety. In this paper we propose new conformalized density- and distance-based anomaly detection algorithms for one-dimensional time-series data. The algorithms combine a feature extraction method, an approach to assessing a score that indicates whether a new observation differs significantly from previously observed data, and a probabilistic interpretation of this score based on the conformal paradigm.
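The conformal idea described above can be sketched as follows (a minimal illustration under simplifying assumptions, not the authors' actual algorithm): compute a nonconformity score for a new observation, here a simple distance to its k nearest neighbors in previously seen data, and convert it to a p-value by ranking it against scores computed on a held-out calibration set.

```python
import numpy as np

def knn_nonconformity(point, reference, k=3):
    """Distance-based nonconformity score: mean absolute distance to the
    k nearest reference points (one simple choice of score)."""
    d = np.sort(np.abs(reference - point))
    return d[:k].mean()

def conformal_p_value(new_point, calibration, reference, k=3):
    """Conformal p-value: fraction of calibration scores that are at
    least as large as the new point's score (small p = anomalous)."""
    new_score = knn_nonconformity(new_point, reference, k)
    cal_scores = np.array([knn_nonconformity(c, reference, k)
                           for c in calibration])
    return (np.sum(cal_scores >= new_score) + 1) / (len(cal_scores) + 1)

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 200)    # previously observed "normal" data
calibration = rng.normal(0.0, 1.0, 100)  # held-out calibration sample
print(conformal_p_value(0.1, calibration, reference))  # typical point: large p
print(conformal_p_value(8.0, calibration, reference))  # outlier: small p
```

The probabilistic interpretation is what the conformal paradigm buys: if the new point is exchangeable with the calibration data, the p-value is (approximately) uniformly distributed, so thresholding it at, say, 0.05 gives a calibrated false-alarm rate.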
At Statsbot, we're constantly reviewing the landscape of anomaly detection approaches and refining our models based on this research. This article is an overview of the most popular anomaly detection algorithms for time series, along with their pros and cons. It is aimed at readers who are new to the field and just want a sense of the current state of anomaly detection techniques. To avoid scaring you with mathematical models, we have hidden all the math behind reference links. The anomaly detection problem for time series is usually formulated as finding outlier data points relative to some standard or usual signal.
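As a concrete toy example of that formulation (a hypothetical baseline, not any particular production model), one can take the "usual signal" to be a trailing rolling mean and flag points that deviate from it by more than a few standard deviations:

```python
import numpy as np

def rolling_zscore_outliers(series, window=20, threshold=3.0):
    """Flag indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations of that window."""
    series = np.asarray(series, dtype=float)
    outliers = []
    for i in range(window, len(series)):
        ref = series[i - window:i]          # trailing "usual signal"
        mu, sigma = ref.mean(), ref.std()
        if sigma > 0 and abs(series[i] - mu) > threshold * sigma:
            outliers.append(i)
    return outliers

rng = np.random.default_rng(1)
signal = rng.normal(0, 1, 200)
signal[150] += 10.0                        # inject an obvious anomaly
print(rolling_zscore_outliers(signal))     # index 150 is flagged
```

Simple as it is, this baseline already illustrates the trade-offs the article surveys: the window length controls how quickly the notion of "usual" adapts, and the threshold trades false alarms against missed anomalies.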
Learning minimum volume sets of an underlying nominal distribution is a very effective approach to anomaly detection. Several approaches to learning minimum volume sets have been proposed in the literature, including the K-point nearest neighbor graph (K-kNNG) algorithm based on the geometric entropy minimization (GEM) principle. The K-kNNG detector, while possessing several desirable characteristics, suffers from high computational complexity, and a simpler heuristic approximation, the leave-one-out kNNG (L1O-kNNG), was proposed. In this paper, we propose a novel bipartite k-nearest neighbor graph (BP-kNNG) anomaly detection scheme for estimating minimum volume sets. Our bipartite estimator retains all the desirable theoretical properties of the K-kNNG while being computationally simpler than both the K-kNNG and the surrogate L1O-kNNG detectors.
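The kNN-graph detectors discussed above all build on the intuition that points outside the minimum volume set of the nominal distribution lie far from their nearest neighbors in a nominal sample. A minimal sketch of that intuition (plain kNN distances; this is not the bipartite BP-kNNG estimator itself, which is defined in the paper):

```python
import numpy as np

def knn_anomaly_score(x, nominal, k=5):
    """Anomaly score: mean Euclidean distance from x to its k nearest
    neighbors in the nominal sample (larger score = more anomalous)."""
    d = np.sort(np.linalg.norm(nominal - x, axis=1))
    return d[:k].mean()

rng = np.random.default_rng(2)
nominal = rng.normal(0, 1, size=(500, 2))  # nominal 2-D training sample
s_in = knn_anomaly_score(np.array([0.0, 0.0]), nominal)   # inlier: small
s_out = knn_anomaly_score(np.array([6.0, 6.0]), nominal)  # outlier: large
print(s_in, s_out)
```

Thresholding such a score at an empirical quantile of the nominal scores approximates declaring points outside a minimum volume set; the graph-based estimators in the paper make that connection rigorous and control the false-alarm level.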
pyISC is a Python API and extension to the C++-based Incremental Stream Clustering (ISC) anomaly detection and classification framework. The framework is based on parametric Bayesian statistical inference using the Bayesian Principal Anomaly (BPA), which makes it possible to combine the output from several probability distributions. pyISC is designed to be easy to use and to integrate with other Python libraries, in particular those used for data science. In this paper, we show how to use the framework and compare its performance to other well-known methods on 22 real-world datasets. The simulation results show that the performance of pyISC is comparable to that of the other methods. pyISC is part of the Stream toolbox developed within the STREAM project.
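To convey the flavor of the parametric statistical approach, here is a heavily simplified stand-in: a two-sided tail probability under a Gaussian fitted to past data, where small values indicate anomalies. This is an illustrative assumption only; it is not pyISC's BPA computation and does not use the pyISC API.

```python
import math
import statistics

def gaussian_tail_anomaly(x, sample):
    """Two-sided tail probability of x under a Gaussian fitted to the
    sample: P(|X - mu| >= |x - mu|). Small values indicate anomalies."""
    mu = statistics.fmean(sample)
    sigma = statistics.stdev(sample)
    z = abs(x - mu) / sigma
    return math.erfc(z / math.sqrt(2))

data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
print(gaussian_tail_anomaly(10.0, data))  # typical value: probability 1.0
print(gaussian_tail_anomaly(13.0, data))  # far in the tail: near zero
```

A distribution-based score like this is what makes outputs from several models combinable, which is the role the BPA plays in the framework.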