xmea
Combining SHAP and Causal Analysis for Interpretable Fault Detection in Industrial Processes
Santos, Pedro Cortes dos, Rocha, Matheus Becali, Krohling, Renato A
Industrial processes generate complex data that challenge fault detection systems, often yielding opaque or underwhelming results despite advanced machine learning techniques. This study tackles such difficulties using the Tennessee Eastman Process, a well-established benchmark known for its intricate dynamics, to develop an innovative fault detection framework. Initial attempts with standard models revealed limitations in both performance and interpretability, prompting a shift toward a more tractable approach. By employing SHAP (SHapley Additive exPlanations), we transform the problem into a more manageable and transparent form, pinpointing the most critical process features driving fault predictions. This reduction in complexity unlocks the ability to apply causal analysis through Directed Acyclic Graphs, generated by multiple algorithms, to uncover the underlying mechanisms of fault propagation. The resulting causal structures align strikingly with SHAP findings, consistently highlighting key process elements-like cooling and separation systems-as pivotal to fault development. Together, these methods not only enhance detection accuracy but also provide operators with clear, actionable insights into fault origins, a synergy that, to our knowledge, has not been previously explored in this context. This dual approach bridges predictive power with causal understanding, offering a robust tool for monitoring complex manufacturing environments and paving the way for smarter, more interpretable fault detection in industrial systems.
Fault Detection and Identification via Monitoring Modules Based on Clusters of Interacting Measurements
Villagomez, Enrique Luna, Mahalec, Vladimir
This work introduces a novel control-aware distributed process monitoring methodology based on modules comprised of clusters of interacting measurements. The methodology relies on the process flow diagram (PFD) and control system structure without requiring cross-correlation data to create monitoring modules. The methodology is validated on the Tennessee Eastman Process benchmark using full Principal Component Analysis (f-PCA) in the monitoring modules. The results are comparable to nonlinear techniques implemented in a centralized manner such as Kernel PCA (KPCA), Autoencoders (AE), and Recurrent Neural Networks (RNN), or distributed techniques like the Distributed Canonical Correlation Analysis (DCCA). Temporal plots of fault detection by different modules show clearly the magnitude and propagation of the fault through each module, pinpointing the module where the fault originates, and separating controllable faults from other faults. This information, combined with PCA contribution plots, helps detection and identification as effectively as more complex nonlinear centralized or distributed methods.
SensorSCAN: Self-Supervised Learning and Deep Clustering for Fault Diagnosis in Chemical Processes
Golyadkin, Maksim, Pozdnyakov, Vitaliy, Zhukov, Leonid, Makarov, Ilya
Modern industrial facilities generate large volumes of raw sensor data during the production process. This data is used to monitor and control the processes and can be analyzed to detect and predict process abnormalities. Typically, the data has to be annotated by experts in order to be used in predictive modeling. However, manual annotation of large amounts of data can be difficult in industrial settings. In this paper, we propose SensorSCAN, a novel method for unsupervised fault detection and diagnosis, designed for industrial chemical process monitoring. We demonstrate our model's performance on two publicly available datasets of the Tennessee Eastman Process with various faults. The results show that our method significantly outperforms existing approaches (+0.2-0.3 TPR for a fixed FPR) and effectively detects most of the process faults without expert annotation. Moreover, we show that the model fine-tuned on a small fraction of labeled data nearly reaches the performance of a SOTA model trained on the full dataset. We also demonstrate that our method is suitable for real-world applications where the number of faults is not known in advance. The code is available at https://github.com/AIRI-Institute/sensorscan.
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods
Menegozzo, Giovanni, Dall'Alba, Diego, Fiorini, Paolo
Causal relationships are commonly examined in manufacturing processes to support faults investigations, perform interventions, and make strategic decisions. Industry 4.0 has made available an increasing amount of data that enable data-driven Causal Discovery (CD). Considering the growing number of recently proposed CD methods, it is necessary to introduce strict benchmarking procedures on publicly available datasets since they represent the foundation for a fair comparison and validation of different methods. This work introduces two novel public datasets for CD in continuous manufacturing processes. The first dataset employs the well-known Tennessee Eastman simulator for fault detection and process control. The second dataset is extracted from an ultra-processed food manufacturing plant, and it includes a description of the plant, as well as multiple ground truths. These datasets are used to propose a benchmarking procedure based on different metrics and evaluated on a wide selection of CD algorithms. This work allows testing CD methods in realistic conditions enabling the selection of the most suitable method for specific target applications. The datasets are available at the following link: https://github.com/giovanniMen
Explainability: Relevance based Dynamic Deep Learning Algorithm for Fault Detection and Diagnosis in Chemical Processes
Agarwal, Piyush, Tamer, Melih, Budman, Hector
The focus of this work is on Statistical Process Control (SPC) of a manufacturing process based on available measurements. Two important applications of SPC in industrial settings are fault detection and diagnosis (FDD). In this work a deep learning (DL) based methodology is proposed for FDD. We investigate the application of an explainability concept to enhance the FDD accuracy of a deep neural network model trained with a data set of relatively small number of samples. The explainability is quantified by a novel relevance measure of input variables that is calculated from a Layerwise Relevance Propagation (LRP) algorithm. It is shown that the relevances can be used to discard redundant input feature vectors/ variables iteratively thus resulting in reduced over-fitting of noisy data, increasing distinguishability between output classes and superior FDD test accuracy. The efficacy of the proposed method is demonstrated on the benchmark Tennessee Eastman Process.
Hierarchical Deep Recurrent Neural Network based Method for Fault Detection and Diagnosis
Agarwal, Piyush, Gonzalez, Jorge Ivan Mireles, Elkamel, Ali, Budman, Hector
A Deep Neural Network (DNN) based algorithm is proposed for the detection and classification of faults in industrial plants. The proposed algorithm has the ability to classify faults, especially incipient faults that are difficult to detect and diagnose with traditional threshold based statistical methods or by conventional Artificial Neural Networks (ANNs). The algorithm is based on a Supervised Deep Recurrent Autoencoder Neural Network (Supervised DRAE-NN) that uses dynamic information of the process along the time horizon. Based on this network a hierarchical structure is formulated by grouping faults based on their similarity into subsets of faults for detection and diagnosis. Further, an external pseudo-random binary signal (PRBS) is designed and injected into the system to identify incipient faults. The hierarchical structure based strategy improves the detection and classification accuracy significantly for both incipient and non-incipient faults. The proposed approach is tested on the benchmark Tennessee Eastman Process resulting in significant improvements in classification as compared to both multivariate linear model-based strategies and non-hierarchical nonlinear model-based strategies.