Recent developments in SCADA (Supervisory Control and Data Acquisition) systems for physical infrastructure, such as high-pressure gas pipeline systems and electric grids, have generated enormous amounts of time series data. This data creates great opportunities for advanced knowledge discovery and data mining methods to identify system failures faster and earlier than operation experts can. This paper presents our effort, in collaboration with a utility company, to solve a grand challenge: using advanced data mining methods to detect leaks on a high-pressure gas transmission system. We developed unsupervised leak detection models that analyze billions of data records to identify leaks of different sizes and impacts with very low false positive rates. In particular, our solution was able to identify small leaks leading to rupture events. The model also identified small leaks not identifiable with current detection systems. Such high-fidelity early identification enables operation personnel to take preventive measures against possible catastrophic events. We then formulate several generic detection methods with models derived from time series anomaly detection methods. We show that our leak detection models are superior to the SCADA alarm system, a mass balance model, and other generic time series anomaly detection models in terms of both detection accuracy and computation time.
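For orientation, the mass balance baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes one inlet and one outlet flow reading per time step, ignores changes in line pack, and uses hypothetical names (`mass_balance_alarms`, `window`, `threshold`) that do not come from the paper.

```python
from typing import List, Sequence


def mass_balance_alarms(
    inflow: Sequence[float],
    outflow: Sequence[float],
    window: int,
    threshold: float,
) -> List[int]:
    """Return the time indices at which the imbalance (inflow minus outflow),
    summed over the trailing `window` samples, exceeds `threshold`.

    Assumption: line pack changes are neglected, so any sustained positive
    imbalance is treated as a possible leak.
    """
    if len(inflow) != len(outflow):
        raise ValueError("inflow and outflow must have the same length")
    if window < 1:
        raise ValueError("window must be at least 1")

    alarms: List[int] = []
    running = 0.0  # imbalance summed over the current trailing window
    for t in range(len(inflow)):
        running += inflow[t] - outflow[t]
        if t >= window:
            # Drop the sample that just left the trailing window.
            running -= inflow[t - window] - outflow[t - window]
        if t >= window - 1 and running > threshold:
            alarms.append(t)
    return alarms
```

A sustained small imbalance only raises an alarm once it has accumulated over the window, which is one reason small leaks can be slow to surface with this kind of baseline.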
In recent years, industrial networks have become increasingly interconnected and opened up to private or public networks. This increases efficiency and manageability, but also enlarges the attack surface. Industrial networks often consist of legacy systems that were not designed with security in mind. In the last decade, an increase in attacks on cyber-physical systems has been observed, with drastic consequences for the physical world. In this work, attack vectors on industrial networks are categorised. A real-world process is simulated, and attacks are then introduced. Finally, two machine learning-based methods for time series anomaly detection are employed to detect the attacks. Matrix Profiles are employed more successfully than a predictive Long Short-Term Memory network, a class of neural networks.
The concept of Industry 4.0 is disrupting the processing industry. It is characterised by a high degree of intercommunication and embedded computation, resulting in decentralised and distributed handling of data. Additionally, cloud storage and Software-as-a-Service (SaaS) approaches promote centralised storage and handling of data, often in third-party networks. Furthermore, Industry 4.0 is driven by novel business cases: lot sizes of one, customer-specific production, real-time observation of process state and progress, and remote maintenance, to name just a few. All of these new business cases make use of the novel technologies. However, cyber security has historically not been a concern in industry. Industrial networks were considered physically separated from public networks. Additionally, the high level of uniqueness of any industrial network was said to prevent attackers from exploiting flaws. Those assumptions are inherently broken by the concept of Industry 4.0. As a result, an abundance of attack vectors is created. In the past, attackers have exploited those attack vectors in spectacular fashion. Small and Medium-sized Enterprises (SMEs) in Germany in particular struggle to adapt to these challenges. The reasons include the cost of technical solutions and of security professionals. To enable SMEs to cope with the growing threat in cyberspace, the research project IUNO Insec aims to provide and improve security solutions that can be used without specialised security knowledge. The project IUNO Insec is briefly introduced in this work. Furthermore, the authors' contributions in the field of intrusion detection for industrial environments, especially machine learning-based solutions, are presented and set into context.
The Industrial Internet of Things drastically increases the connectivity of devices in industrial applications. In addition to benefits in efficiency, scalability and ease of use, this creates novel attack surfaces. Historically, industrial networks and protocols have not included security measures, such as authentication and encryption, that this development makes necessary. Thus, industrial IT security is needed. In this work, emulated industrial network data is transformed into a time series and analysed with three different algorithms. The data contains labeled attacks, so the performance can be evaluated. Matrix Profiles perform well with almost no parameterisation needed. Seasonal Autoregressive Integrated Moving Average models perform well in the presence of noise but require parameterisation effort. Long Short-Term Memory-based neural networks perform only moderately while requiring high training and parameterisation effort.
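To make the Matrix Profile idea concrete, the following is a naive sketch of discord (anomaly) detection on a univariate time series. It is an illustration only, not the implementation evaluated in the work above: it uses a brute-force quadratic search instead of an optimised algorithm, and the function names, the exclusion zone of half the window length, and the requirement of at least three window lengths of data are assumptions made for this example. The only parameter is the subsequence length `m`, which reflects the low parameterisation effort noted in the abstract.

```python
import math
from typing import List, Sequence


def znorm(window: Sequence[float]) -> List[float]:
    """Z-normalise a subsequence; constant subsequences map to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series: Sequence[float], m: int) -> List[float]:
    """For every subsequence of length m, return the z-normalised Euclidean
    distance to its nearest non-trivial neighbour (brute force)."""
    if m < 2 or len(series) < 3 * m:
        raise ValueError("need m >= 2 and at least 3 * m samples")
    n_sub = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    exclusion = max(1, m // 2)  # skip trivial matches that overlap heavily
    profile = [math.inf] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if dist < profile[i]:
                profile[i] = dist
    return profile


def top_discord(series: Sequence[float], m: int) -> int:
    """Start index of the subsequence with the largest profile value,
    i.e. the one least similar to everything else in the series."""
    profile = matrix_profile(series, m)
    return max(range(len(profile)), key=profile.__getitem__)
```

On a periodic signal with a single injected spike, the subsequences covering the spike have no close match elsewhere, so `top_discord` returns a start index inside that region.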
The cost of wind energy can be reduced by using SCADA data to detect faults in wind turbine components. Normal behavior models are one of the main fault detection approaches, but there is a lack of consensus on how different input features affect the results. In this work, a new taxonomy based on the causal relations between the input features and the target is presented. Based on this taxonomy, the impact of different input feature configurations on the modelling and fault detection performance is evaluated. To this end, a framework that formulates the detection of faults as a classification problem is also presented.
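As a rough illustration of the normal behavior model approach described above, the sketch below fits a model of a target signal on fault-free data and then labels each new sample as faulty or not from its residual. It is not the framework presented in the work: the single input feature, the least-squares line, the fixed residual threshold, and all names (`fit_line`, `residual_flags`, power, temperature) are assumptions chosen to keep the example small.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit y ~ slope * x + intercept on fault-free data."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("input feature must vary in the training data")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def residual_flags(
    model: Tuple[float, float],
    x: Sequence[float],
    y: Sequence[float],
    threshold: float,
) -> List[bool]:
    """Label each sample True (suspected fault) when the measured target
    deviates from the modelled normal behavior by more than `threshold`."""
    slope, intercept = model
    return [abs(yi - (slope * xi + intercept)) > threshold for xi, yi in zip(x, y)]


# Hypothetical example: power output as input, a component temperature as target.
healthy_power = [0.0, 1.0, 2.0, 3.0, 4.0]
healthy_temp = [40.0, 42.0, 44.0, 46.0, 48.0]
model = fit_line(healthy_power, healthy_temp)
flags = residual_flags(model, [1.0, 2.0, 3.0], [42.0, 44.5, 52.0], threshold=2.0)
```

Which signals are fed in as input features changes what the residual means, which is the question the taxonomy in the abstract addresses.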