Diagnosis
Statistical Batch-Based Bearing Fault Detection
Jorry, Victoria, Duma, Zina-Sabrina, Sihvonen, Tuomas, Reinikainen, Satu-Pia, Roininen, Lassi
In the domain of rotating machinery, bearings are vulnerable to different mechanical faults, including ball, inner, and outer race faults. Various techniques can be used in condition-based monitoring, from classical signal analysis to deep learning methods. Based on the complex working conditions of rotary machines, multivariate statistical process control charts such as Hotelling's $T^2$ and Squared Prediction Error are useful for providing early warnings. However, these methods are rarely applied to condition monitoring of rotating machinery due to the univariate nature of the datasets. In the present paper, we propose a multivariate statistical process control-based fault detection method that utilizes multivariate data composed of Fourier transform features extracted for fixed-time batches. Our approach makes use of the multidimensional nature of Fourier transform characteristics, which record more detailed information about the machine's status, in an effort to enhance early defect detection and diagnosis. Experiments with varying vibration measurement locations (Fan End, Drive End), fault types (ball, inner, and outer race faults), and motor loads (0-3 horsepower) are used to validate the suggested approach. The outcomes illustrate our method's effectiveness in fault detection and point to possible broader uses in industrial maintenance.
EverAdapt: Continuous Adaptation for Dynamic Machine Fault Diagnosis Environments
Edward, null, Ragab, Mohamed, Xu, Yuecong, Wu, Min, Xu, Yuecong, Chen, Zhenghua, Alseiari, Abdulla, Li, Xiaoli
Unsupervised Domain Adaptation (UDA) has emerged as a key solution in data-driven fault diagnosis, addressing domain shift where models underperform in changing environments. However, under the realm of continually changing environments, UDA tends to underperform on previously seen domains when adapting to new ones - a problem known as catastrophic forgetting. To address this limitation, we introduce the EverAdapt framework, specifically designed for continuous model adaptation in dynamic environments. Central to EverAdapt is a novel Continual Batch Normalization (CBN), which leverages source domain statistics as a reference point to standardize feature representations across domains. EverAdapt not only retains statistical information from previous domains but also adapts effectively to new scenarios. Complementing CBN, we design a class-conditional domain alignment module for effective integration of target domains, and a Sample-efficient Replay strategy to reinforce memory retention. Experiments on real-world datasets demonstrate EverAdapt superiority in maintaining robust fault diagnosis in dynamic environments. Our code is available: https://github.com/mohamedr002/EverAdapt
AI-Enhanced 7-Point Checklist for Melanoma Detection Using Clinical Knowledge Graphs and Data-Driven Quantification
Wang, Yuheng, Yu, Tianze, Cai, Jiayue, Kalia, Sunil, Lui, Harvey, Wang, Z. Jane, Lee, Tim K.
The 7-point checklist (7PCL) is widely used in dermoscopy to identify malignant melanoma lesions needing urgent medical attention. It assigns point values to seven attributes: major attributes are worth two points each, and minor ones are worth one point each. A total score of three or higher prompts further evaluation, often including a biopsy. However, a significant limitation of current methods is the uniform weighting of attributes, which leads to imprecision and neglects their interconnections. Previous deep learning studies have treated the prediction of each attribute with the same importance as predicting melanoma, which fails to recognize the clinical significance of the attributes for melanoma. To address these limitations, we introduce a novel diagnostic method that integrates two innovative elements: a Clinical Knowledge-Based Topological Graph (CKTG) and a Gradient Diagnostic Strategy with Data-Driven Weighting Standards (GD-DDW). The CKTG integrates 7PCL attributes with diagnostic information, revealing both internal and external associations. By employing adaptive receptive domains and weighted edges, we establish connections among melanoma's relevant features. Concurrently, GD-DDW emulates dermatologists' diagnostic processes, who first observe the visual characteristics associated with melanoma and then make predictions. Our model uses two imaging modalities for the same lesion, ensuring comprehensive feature acquisition. Our method shows outstanding performance in predicting malignant melanoma and its features, achieving an average AUC value of 85%. This was validated on the EDRA dataset, the largest publicly available dataset for the 7-point checklist algorithm. Specifically, the integrated weighting system can provide clinicians with valuable data-driven benchmarks for their evaluations.
On the Parameter Identifiability of Partially Observed Linear Causal Models
Dong, Xinshuai, Ng, Ignavier, Huang, Biwei, Sun, Yuewen, Jin, Songyao, Legaspi, Roberto, Spirtes, Peter, Zhang, Kun
Linear causal models are important tools for modeling causal dependencies and yet in practice, only a subset of the variables can be observed. In this paper, we examine the parameter identifiability of these models by investigating whether the edge coefficients can be recovered given the causal structure and partially observed data. Our setting is more general than that of prior research - we allow all variables, including both observed and latent ones, to be flexibly related, and we consider the coefficients of all edges, whereas most existing works focus only on the edges between observed variables. Theoretically, we identify three types of indeterminacy for the parameters in partially observed linear causal models. We then provide graphical conditions that are sufficient for all parameters to be identifiable and show that some of them are provably necessary. Methodologically, we propose a novel likelihood-based parameter estimation method that addresses the variance indeterminacy of latent variables in a specific way and can asymptotically recover the underlying parameters up to trivial indeterminacy. Empirical studies on both synthetic and real-world datasets validate our identifiability theory and the effectiveness of the proposed method in the finite-sample regime.
Development of Multistage Machine Learning Classifier using Decision Trees and Boosting Algorithms over Darknet Network Traffic
Nair, Anjali Sureshkumar, Nitnaware, Prashant
In recent years, the clandestine nature of darknet activities has presented an escalating challenge to cybersecurity efforts, necessitating sophisticated methods for the detection and classification of network traffic associated with these covert operations. The system addresses the significant challenge of class imbalance within Darknet traffic datasets, where malicious traffic constitutes a minority, hindering effective discrimination between normal and malicious behavior. By leveraging boosting algorithms like AdaBoost and Gradient Boosting coupled with decision trees, this study proposes a robust solution for network traffic classification. Boosting algorithms ensemble learning corrects errors iteratively and assigns higher weights to minority class instances, complemented by the hierarchical structure of decision trees. The additional Feature Selection which is a preprocessing method by utilizing Information Gain metrics, Fisher's Score, and Chi-Square test selection for features is employed. Rigorous experimentation with diverse Darknet traffic datasets validates the efficacy of the proposed multistage classifier, evaluated through various performance metrics such as accuracy, precision, recall, and F1-score, offering a comprehensive solution for accurate detection and classification of Darknet activities.
Double Gradient Reversal Network for Single-Source Domain Generalization in Multi-mode Fault Diagnosis
Li, Guangqiang, Atoui, M. Amine, Li, Xiangshun
Domain generalization achieves fault diagnosis on unseen modes. In process industrial systems, fault samples are limited, and only single-mode fault data can be obtained. Extracting domain-invariant fault features from single-mode data for unseen mode fault diagnosis poses challenges. Existing methods utilize a generator module to simulate samples of unseen modes. However, multi-mode samples contain complex spatiotemporal information, which brings significant difficulties to accurate sample generation. Therefore, double gradient reversal network (DGRN) is proposed. First, the model is pre-trained to acquire fault knowledge from the single seen mode. Then, pseudo-fault feature generation strategy is designed by Adaptive instance normalization, to simulate fault features of unseen mode. The dual adversarial training strategy is created to enhance the diversity of pseudo-fault features, which models unseen modes with significant distribution differences. Subsequently, domain-invariant feature extraction strategy is constructed by contrastive learning and adversarial learning. This strategy extracts common features of faults and helps multi-mode fault diagnosis. Finally, the experiments were conducted on Tennessee Eastman process and continuous stirred-tank reactor. The experiments demonstrate that DGRN achieves high classification accuracy on unseen modes while maintaining a small model size.
Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data
As language models (LMs) deliver increasing performance on a range of NLP tasks, probing classifiers have become an indispensable technique in the effort to better understand their inner workings. A typical setup involves (1) defining an auxiliary task consisting of a dataset of text annotated with labels, then (2) supervising small classifiers to predict the labels from the representations of a pretrained LM as it processed the dataset. A high probing accuracy is interpreted as evidence that the LM has learned to perform the auxiliary task as an unsupervised byproduct of its original pretraining objective. Despite the widespread usage of probes, however, the robust design and analysis of probing experiments remains a challenge. We develop a formal perspective on probing using structural causal models (SCM). Specifically, given an SCM which explains the distribution of tokens observed during training, we frame the central hypothesis as whether the LM has learned to represent the latent variables of the SCM. Empirically, we extend a recent study of LMs in the context of a synthetic grid-world navigation task, where having an exact model of the underlying causal structure allows us to draw strong inferences from the result of probing experiments. Our techniques provide robust empirical evidence for the ability of LMs to learn the latent causal concepts underlying text.
On the Complexity of Identification in Linear Structural Causal Models
Dörfler, Julian, van der Zander, Benito, Bläser, Markus, Liskiewicz, Maciej
Learning the unknown causal parameters of a linear structural causal model is a fundamental task in causal analysis. The task, known as the problem of identification, asks to estimate the parameters of the model from a combination of assumptions on the graphical structure of the model and observational data, represented as a non-causal covariance matrix. In this paper, we give a new sound and complete algorithm for generic identification which runs in polynomial space. By standard simulation results, this algorithm has exponential running time which vastly improves the state-of-the-art double exponential time method using a Gr\"obner basis approach. The paper also presents evidence that parameter identification is computationally hard in general. In particular, we prove, that the task asking whether, for a given feasible correlation matrix, there are exactly one or two or more parameter sets explaining the observed matrix, is hard for $\forall R$, the co-class of the existential theory of the reals. In particular, this problem is $coNP$-hard. To our best knowledge, this is the first hardness result for some notion of identifiability.
Thyroidiomics: An Automated Pipeline for Segmentation and Classification of Thyroid Pathologies from Scintigraphy Images
Sabouri, Maziar, Ahamed, Shadab, Asadzadeh, Azin, Avval, Atlas Haddadi, Bagheri, Soroush, Arabi, Mohsen, Zakavi, Seyed Rasoul, Askari, Emran, Rasouli, Ali, Aghaee, Atena, Sehati, Mohaddese, Yousefirizi, Fereshteh, Uribe, Carlos, Hajianfar, Ghasem, Zaidi, Habib, Rahmim, Arman
The objective of this study was to develop an automated pipeline that enhances thyroid disease classification using thyroid scintigraphy images, aiming to decrease assessment time and increase diagnostic accuracy. Anterior thyroid scintigraphy images from 2,643 patients were collected and categorized into diffuse goiter (DG), multinodal goiter (MNG), and thyroiditis (TH) based on clinical reports, and then segmented by an expert. A ResUNet model was trained to perform auto-segmentation. Radiomic features were extracted from both physician (scenario 1) and ResUNet segmentations (scenario 2), followed by omitting highly correlated features using Spearman's correlation, and feature selection using Recursive Feature Elimination (RFE) with XGBoost as the core. All models were trained under leave-one-center-out cross-validation (LOCOCV) scheme, where nine instances of algorithms were iteratively trained and validated on data from eight centers and tested on the ninth for both scenarios separately. Segmentation performance was assessed using the Dice similarity coefficient (DSC), while classification performance was assessed using metrics, such as precision, recall, F1-score, accuracy, area under the Receiver Operating Characteristic (ROC AUC), and area under the precision-recall curve (PRC AUC). ResUNet achieved DSC values of 0.84$\pm$0.03, 0.71$\pm$0.06, and 0.86$\pm$0.02 for MNG, TH, and DG, respectively. Classification in scenario 1 achieved an accuracy of 0.76$\pm$0.04 and a ROC AUC of 0.92$\pm$0.02 while in scenario 2, classification yielded an accuracy of 0.74$\pm$0.05 and a ROC AUC of 0.90$\pm$0.02. The automated pipeline demonstrated comparable performance to physician segmentations on several classification metrics across different classes, effectively reducing assessment time while maintaining high diagnostic accuracy. Code available at: https://github.com/ahxmeds/thyroidiomics.git.
CFaults: Model-Based Diagnosis for Fault Localization in C Programs with Multiple Test Cases
Orvalho, Pedro, Janota, Mikoláš, Manquinho, Vasco
Debugging is one of the most time-consuming and expensive tasks in software development. Several formula-based fault localization (FBFL) methods have been proposed, but they fail to guarantee a set of diagnoses across all failing tests or may produce redundant diagnoses that are not subset-minimal, particularly for programs with multiple faults. This paper introduces a novel fault localization approach for C programs with multiple faults. CFaults leverages Model-Based Diagnosis (MBD) with multiple observations and aggregates all failing test cases into a unified MaxSAT formula. Consequently, our method guarantees consistency across observations and simplifies the fault localization procedure. Experimental results on two benchmark sets of C programs, TCAS and C-Pack-IPAs, show that CFaults is faster than other FBFL approaches like BugAssist and SNIPER. Moreover, CFaults only generates subset-minimal diagnoses of faulty statements, whereas the other approaches tend to enumerate redundant diagnoses.