Anomaly detection in supercomputers is a very difficult problem due to the big scale of the systems and the high number of components. The current state of the art for automated anomaly detection employs Machine Learning methods or statistical regression models in a supervised fashion, meaning that the detection tool is trained to distinguish among a fixed set of behaviour classes (healthy and unhealthy states). We propose a novel approach for anomaly detection in High Performance Computing systems based on a Machine (Deep) Learning technique, namely a type of neural network called autoencoder. The key idea is to train a set of autoencoders to learn the normal (healthy) behaviour of the supercomputer nodes and, after training, use them to identify abnormal conditions. This is different from previous approaches which where based on learning the abnormal condition, for which there are much smaller datasets (since it is very hard to identify them to begin with). We test our approach on a real supercomputer equipped with a fine-grained, scalable monitoring infrastructure that can provide large amount of data to characterize the system behaviour. The results are extremely promising: after the training phase to learn the normal system behaviour, our method is capable of detecting anomalies that have never been seen before with a very good accuracy (values ranging between 88% and 96%).
We present a novel unsupervised deep learning approach that utilizes the encoder-decoder architecture for detecting anomalies in sequential sensor data collected during industrial manufacturing. Our approach is designed not only to detect whether there exists an anomaly at a given time step, but also to predict what will happen next in the (sequential) process. We demonstrate our approach on a dataset collected from a real-world testbed. The dataset contains images collected under both normal conditions and synthetic anomalies. We show that the encoder-decoder model is able to identify the injected anomalies in a modern manufacturing process in an unsupervised fashion. In addition, it also gives hints about the temperature non-uniformity of the testbed during manufacturing, which is what we are not aware of before doing the experiment.
Anomaly detection aims to recognize samples with anomalous and unusual patterns with respect to a set of normal data, which is significant for numerous domain applications, e.g. in industrial inspection, medical imaging, and security enforcement. There are two key research challenges associated with existing anomaly detention approaches: (1) many of them perform well on low-dimensional problems however the performance on high-dimensional instances is limited, such as images; (2) many of them depend on often still rely on traditional supervised approaches and manual engineering of features, while the topic has not been fully explored yet using modern deep learning approaches, even when the well-label samples are limited. In this paper, we propose a One-for-all Image Anomaly Detection system (OIAD) based on disentangled learning using only clean samples. Our key insight is that the impact of small perturbation on the latent representation can be bounded for normal samples while anomaly images are usually outside such bounded intervals, called structure consistency. We implement this idea and evaluate its performance for anomaly detention. Our experiments with three datasets show that OIAD can detect over $90\%$ of anomalies while maintaining a high low false alarm rate. It can also detect suspicious samples from samples labeled as clean, coincided with what humans would deem unusual.
In order to support stable web-based applications and services, anomalies on the IT performance status have to be detected timely. Moreover, the performance trend across the time series should be predicted. In this paper, we propose SeqVL (Sequential VAE-LSTM), a neural network model based on both VAE (Variational Auto-Encoder) and LSTM (Long Short-Term Memory). This work is the first attempt to integrate unsupervised anomaly detection and trend prediction under one framework. Moreover, this model performs considerably better on detection and prediction than VAE and LSTM work alone. On unsupervised anomaly detection, SeqVL achieves competitive experimental results compared with other state-of-the-art methods on public datasets. On trend prediction, SeqVL outperforms several classic time series prediction models in the experiments of the public dataset.
The emergence of continuous health monitoring and the availability of an enormous amount of time series data has provided a great opportunity for the advancement of personal health tracking. In recent years, unsupervised learning methods have drawn special attention of researchers to tackle the sparse annotation of health data and real-time detection of anomalies has been a central problem of interest. However, one problem that has not been well addressed before is the early prediction of forthcoming negative health events. Early signs of an event can introduce subtle and gradual changes in the health signal prior to its onset, detection of which can be invaluable in effective prevention. In this study, we first demonstrate our observations on the shortcoming of widely adopted anomaly detection methods in uncovering the changes prior to a negative health event. We then propose a framework which relies on online clustering of signal segment representations which are automatically learned by a specially designed LSTM auto-encoder. We show the effectiveness of our approach by predicting Bradycardia events in infants using MIT-PICS dataset 1.3 minutes ahead of time with 68\% AUC score on average, using no label supervision. Results of our study can indicate the viability of our approach in the early detection of health events in other applications as well.