Given high-dimensional time series data (e.g., sensor data), how can we detect anomalous events, such as system faults and attacks? More challengingly, how can we do this in a way that captures complex inter-sensor relationships, and detects and explains anomalies which deviate from these relationships? Recently, deep learning approaches have enabled improvements in anomaly detection in high-dimensional datasets; however, existing methods do not explicitly learn the structure of existing relationships between variables, or use them to predict the expected behavior of time series. Our approach combines a structure learning approach with graph neural networks, additionally using attention weights to provide explainability for the detected anomalies. Experiments on two real-world sensor datasets with ground truth anomalies show that our method detects anomalies more accurately than baseline approaches, accurately captures correlations between sensors, and allows users to deduce the root cause of a detected anomaly.
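As a rough illustration of how deviations from expected behavior could be turned into an anomaly score and a root-cause hint, the sketch below compares each sensor's observed value with a forecast, normalizes the error by that sensor's historical error statistics, and reports the most deviating sensor. The median/IQR normalization, the function names, and the sensor names are assumptions made for this example; the abstract does not specify the scoring rule, and the forecasts here are placeholders rather than outputs of a graph neural network.

```python
from statistics import median


def robust_scale(errors):
    """Return (median, interquartile range) of a sensor's past forecast errors."""
    s = sorted(errors)
    n = len(s)
    q1 = median(s[: n // 2])        # median of the lower half
    q3 = median(s[(n + 1) // 2:])   # median of the upper half
    return median(s), q3 - q1


def anomaly_score(observed, predicted, history, eps=1e-6):
    """Score one time step.

    observed, predicted: dicts mapping sensor name -> value.
    history: dict mapping sensor name -> list of past absolute forecast errors.
    Returns (score, sensor) where sensor is the one deviating the most.
    """
    deviations = {}
    for name in observed:
        err = abs(observed[name] - predicted[name])
        med, iqr = robust_scale(history[name])
        # Normalizing keeps sensors with naturally noisy forecasts from dominating.
        deviations[name] = (err - med) / (iqr + eps)
    culprit = max(deviations, key=deviations.get)
    return deviations[culprit], culprit


# Hypothetical sensors and values, for illustration only.
history = {"flow": [0.1, 0.2, 0.1, 0.3, 0.2], "level": [0.5, 0.4, 0.6, 0.5, 0.5]}
observed = {"flow": 2.0, "level": 10.1}
predicted = {"flow": 1.9, "level": 7.0}
score, culprit = anomaly_score(observed, predicted, history)
```

In this toy case the `level` sensor deviates far more than its history suggests, so it would be the first place to look when explaining the alarm.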
Jia, Yifan, Wang, Jingyi, Poskitt, Christopher M., Chattopadhyay, Sudipta, Sun, Jun, Chen, Yuqi
The threats faced by cyber-physical systems (CPSs) in critical infrastructure have motivated research into a multitude of attack detection mechanisms, including anomaly detectors based on neural network models. The effectiveness of anomaly detectors can be assessed by subjecting them to test suites of attacks, but less consideration has been given to adversarial attackers that craft noise specifically designed to deceive them. While successfully applied in domains such as images and audio, adversarial attacks are much harder to implement in CPSs due to the presence of other built-in defence mechanisms such as rule checkers (or invariant checkers). In this work, we present an adversarial attack that simultaneously evades the anomaly detectors and rule checkers of a CPS. Inspired by existing gradient-based approaches, our adversarial attack crafts noise over the sensor and actuator values, then uses a genetic algorithm to optimise the latter, ensuring that the neural network and the rule checking system are both deceived. We implemented our approach for two real-world critical infrastructure testbeds, successfully reducing the classification accuracy of their detectors by over 50% on average, while simultaneously avoiding detection by rule checkers. Finally, we explore whether these attacks can be mitigated by training the detectors on adversarial samples.
Garmaroodi, Mohammad Sadegh Sadeghi, Farivar, Faezeh, Haghighi, Mohammad Sayad, Shoorehdeli, Mahdi Aliyari, Jolfaei, Alireza
Industry 4.0 will make manufacturing processes smarter, but this smartness requires more environmental awareness, which in the case of the Industrial Internet of Things is realized with the help of sensors. This article is about industrial pharmaceutical systems and, more specifically, water purification systems. Purified water with a certain conductivity is an important ingredient in many pharmaceutical products. Almost every pharmaceutical company has a water purifying unit as a part of its interdependent systems. Early detection of faults right at the edge can significantly decrease maintenance costs and improve safety and output quality and, as a result, lead to the production of better medicines. In this paper, with the help of a few sensors and data mining approaches, an anomaly detection system is built for the CHRIST Osmotron water purifier. This is practical research with real-world data collected from SinaDarou Labs Co. Data were collected using six sensors over two-week intervals before and after a system overhaul, which gave us normal and faulty operation samples. Given the data, we propose two anomaly detection approaches to build our edge fault detection system. The first approach is based on supervised learning and data mining, e.g., support vector machines. However, since we cannot collect data for all possible faults, a second anomaly detection approach is proposed based on normal system identification, which models the system components with artificial neural networks. Extensive experiments are conducted with the dataset generated in this study to show the accuracy of the data-driven and model-based anomaly detection methods.
Often these processes result in high-dimensional data sets with complex relationships within the data, and they exhibit stochastic behavior. Furthermore, anomalies by definition have high self-information and therefore carry useful information about the underlying data generation process. There exist a number of similar definitions of what an anomaly is; however, in this paper the following definition is adopted: 1. Anomalies are different from the norm with respect to their attributes.
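To make the self-information remark concrete: the self-information of an event with probability p is -log2(p) bits, so rare events carry more information than common ones. The sketch below estimates probabilities from empirical frequencies; the function name and the event labels are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def self_information(events):
    """Map each distinct event to its self-information in bits, -log2(p),
    with p estimated as the event's relative frequency in `events`."""
    counts = Counter(events)
    total = len(events)
    return {event: -math.log2(count / total) for event, count in counts.items()}


# Hypothetical log: 15 normal readings and a single fault.
readings = ["normal"] * 15 + ["fault"]
info = self_information(readings)
# The fault has p = 1/16, i.e. 4 bits; a normal reading has p = 15/16, under 0.1 bits.
```

The rare event dominates the information content of the log, which is the sense in which anomalies are informative about the data generation process.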
Li, Dan, Chen, Dacheng, Shi, Lei, Jin, Baihong, Goh, Jonathan, Ng, See-Kiong
The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generates substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlation and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and discriminator produced by the GAN, using a novel anomaly score called DR-score to detect anomalies by discrimination and reconstruction. We have tested our proposed MAD-GAN using two recent datasets collected from real-world CPS: the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results showed that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber-intrusions in these complex real-world systems.
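One way to picture a score that combines discrimination and reconstruction is a weighted sum of a reconstruction residual and the discriminator's belief that a sample is not normal. The sketch below is only an assumption about the general shape of such a score: the weighting `lam`, the mean-absolute residual, and the function name are hypothetical, and the abstract does not give the DR-score formula. The reconstruction and the discriminator output are passed in as plain numbers rather than produced by a trained GAN.

```python
def dr_score(sample, reconstruction, disc_real_prob, lam=0.5):
    """Combine reconstruction and discrimination evidence into one score.

    sample, reconstruction: equal-length sequences of sensor values.
    disc_real_prob: discriminator's probability that the sample is normal.
    lam: weight on the reconstruction term (assumed, not from the paper).
    Higher scores mean "more anomalous".
    """
    # Reconstruction term: mean absolute difference between sample and reconstruction.
    residual = sum(abs(a - b) for a, b in zip(sample, reconstruction)) / len(sample)
    # Discrimination term: how strongly the discriminator rejects the sample.
    discrimination = 1.0 - disc_real_prob
    return lam * residual + (1.0 - lam) * discrimination


# Hypothetical values for illustration.
normal = dr_score([0.5, 0.4, 0.6], [0.5, 0.4, 0.6], disc_real_prob=0.9)
attacked = dr_score([0.5, 0.9, 0.1], [0.5, 0.4, 0.6], disc_real_prob=0.2)
```

A sample that the generator reconstructs poorly and the discriminator rejects scores higher on both terms, so the combined score separates it from the well-reconstructed sample.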
Li, Dan, Chen, Dacheng, Goh, Jonathan, Ng, See-kiong
Today's Cyber-Physical Systems (CPSs) are large, complex, and affixed with networked sensors and actuators that are targets for cyber-attacks. Conventional detection techniques are unable to deal with the increasingly dynamic and complex nature of the CPSs. On the other hand, the networked sensors and actuators generate large amounts of data streams that can be continuously monitored for intrusion events. Unsupervised machine learning techniques can be used to model the system behaviour and classify deviant behaviours as possible attacks. In this work, we proposed a novel Generative Adversarial Networks-based Anomaly Detection (GAN-AD) method for such complex networked CPSs. We used LSTM-RNN in our GAN to capture the distribution of the multivariate time series of the sensors and actuators under normal working conditions of a CPS. Instead of treating each sensor's and actuator's time series independently, we model the time series of multiple sensors and actuators in the CPS concurrently to take into account potential latent interactions between them. To exploit both the generator and the discriminator of our GAN, we deployed the GAN-trained discriminator together with the residuals between generator-reconstructed data and the actual samples to detect possible anomalies in the complex CPS. We used our GAN-AD to distinguish abnormal attacked situations from normal working conditions for a complex six-stage Secure Water Treatment (SWaT) system. Experimental results showed that the proposed strategy is effective in identifying anomalies caused by various attacks, with a high detection rate and a low false positive rate compared to existing methods.
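For readers unfamiliar with the two reported metrics, the sketch below shows how a detection rate and a false positive rate can be computed from per-sample anomaly scores, ground-truth attack labels, and a threshold. The function name, the threshold, and the data are hypothetical and are not results from the paper.

```python
def detection_and_false_positive_rate(scores, labels, threshold):
    """Return (detection rate, false positive rate) at a given threshold.

    scores: anomaly score per sample (higher means more suspicious).
    labels: True where the sample was recorded during an attack.
    """
    flagged = [s > threshold for s in scores]
    true_pos = sum(1 for f, y in zip(flagged, labels) if f and y)
    false_pos = sum(1 for f, y in zip(flagged, labels) if f and not y)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    # Guard against empty classes so the function never divides by zero.
    detection_rate = true_pos / positives if positives else 0.0
    false_positive_rate = false_pos / negatives if negatives else 0.0
    return detection_rate, false_positive_rate


# Hypothetical scores and labels for illustration.
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7]
labels = [False, False, True, True, False, False]
dr, fpr = detection_and_false_positive_rate(scores, labels, threshold=0.5)
```

Here both attacked samples are flagged and one of the four normal samples is flagged by mistake; raising the threshold would lower the false positive rate at the risk of missing attacks.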