Performance Analysis
Automated Machine Learning: A Case Study on Non-Intrusive Appliance Load Monitoring
Moin, Armin, Wattanavaekin, Ukrit, Lungu, Alexandra, Rössler, Stephan, Günnemann, Stephan
We propose a novel approach to enable Automated Machine Learning (AutoML) for Non-Intrusive Appliance Load Monitoring (NIALM), also known as Energy Disaggregation, through Bayesian Optimization. NIALM offers a cost-effective alternative to smart meters for measuring the energy consumption of electric devices and appliances. NIALM methods analyze the entire power consumption signal of a household and predict the type of appliances as well as their individual power consumption (i.e., their contributions to the aggregated signal). We enable NIALM domain experts and practitioners who typically have no deep data analytics or Machine Learning (ML) skills to benefit from state-of-the-art ML approaches to NIALM. Further, we conduct a survey and benchmarking of the state of the art and show that in many cases, simple and basic ML models and algorithms, such as Decision Trees, outperform the state of the art. Finally, we present our open-source tool, AutoML4NIALM, which will facilitate the exploitation of existing methods for NIALM in the industry.
Enhanced semi-supervised stamping process monitoring with physically-informed feature extraction
To tackle frequent batch anomalies in high-speed stamping processes, this study introduces a novel semi-supervised in-process anomaly monitoring framework that uses accelerometer signals and physics information to capture process anomalies effectively. The proposed framework supports the construction of a monitoring model under imbalanced sample distribution, enabling real-time in-process condition monitoring that prevents batch anomalies, reduces the risk of batch defects, and enhances production yield. First, to effectively capture key features from raw data containing redundant information, a hybrid feature extraction algorithm is proposed that uses data-driven methods and physical mechanisms simultaneously. Second, to address the challenge posed by imbalanced sample distribution, a semi-supervised anomaly detection model is established that uses only normal samples to build a golden baseline model, and a novel deviation score is proposed to quantify the anomaly level of each online stamping stroke. The effectiveness of the proposed feature extraction method is validated with various classification algorithms. A real-world in-process dataset from a stamping manufacturing workshop is used to demonstrate the superiority of the proposed semi-supervised framework, with enhanced performance for process anomaly monitoring.
Leveraging Multi-Task Learning for Multi-Label Power System Security Assessment
Za'ter, Muhy Eddin, Sajad, Amir, Hodge, Bri-Mathias
This paper introduces a novel approach to power system security assessment using Multi-Task Learning (MTL), reformulating the problem as a multi-label classification task. The proposed MTL framework simultaneously assesses static, voltage, transient, and small-signal stability, improving both accuracy and interpretability relative to state-of-the-art machine learning methods. It consists of a shared encoder and multiple decoders, enabling knowledge transfer between stability tasks. Experiments on the IEEE 68-bus system demonstrate measurably superior performance of the proposed method compared to existing state-of-the-art approaches. Power system security assessment (PSSA) is an essential application in energy management systems [1] that ensures the reliability and stability of energy delivery [2]. Power system operators routinely perform security assessments to ensure the system can withstand disturbances, typically involving steady-state and dynamic simulations every 15 minutes to prepare contingency plans for critical scenarios [3]. In recent years, mainly due to the changing energy mix of electricity grids around the globe, conducting real-time PSSA has become so complex that many power utilities may abandon this critical function. Instead, they rely solely on static security assessment, risking blackouts caused by dynamic instabilities.
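The shared-encoder, multiple-decoder layout described above can be sketched as a forward pass in plain Python. This is a minimal illustration only, not the authors' architecture: the layer sizes, the random initialization, the single linear-plus-ReLU encoder, and the names `build_model` and `predict` are all assumptions made for the example.

```python
import math
import random

# One output head per stability task named in the abstract.
TASKS = ("static", "voltage", "transient", "small_signal")


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_out, n_in, rng):
    """Hypothetical initialization: small uniform weights, zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_model(n_features, n_hidden, seed=0):
    rng = random.Random(seed)
    return {
        "encoder": init_layer(n_hidden, n_features, rng),   # shared by all tasks
        "heads": {task: init_layer(1, n_hidden, rng) for task in TASKS},
    }


def predict(model, x):
    """Encode once, then let each task-specific head score the shared code."""
    shared = relu(linear(x, *model["encoder"]))
    return {task: sigmoid(linear(shared, *head)[0])
            for task, head in model["heads"].items()}


model = build_model(n_features=6, n_hidden=4)
scores = predict(model, [0.1, -0.3, 0.7, 0.0, 0.5, -0.2])
```

Each entry of `scores` is an independent probability-like value, which is what makes the formulation multi-label: an operating point can be flagged insecure for several stability types at once. Because every head reads the same encoded vector, training signal from one task also shapes the representation used by the others.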
Estimating Quality in Therapeutic Conversations: A Multi-Dimensional Natural Language Processing Framework
Rueda, Alice, Perivolaris, Argyrios, Roy, Niloy, Weston, Dylan, Shaya, Sarmed, Cote, Zachary, Ivanov, Martin, Teferra, Bazen G., Wu, Yuqi, Rambhatla, Sirisha, Sharma, Divya, Greenshaw, Andrew, Jetly, Rakesh, Zhang, Yanbo, Cao, Bo, Samavi, Reza, Krishnan, Sridhar, Bhat, Venkat
Engagement between client and therapist is a critical determinant of therapeutic success. We propose a multi-dimensional natural language processing (NLP) framework that objectively classifies engagement quality in counseling sessions based on textual transcripts. Using 253 motivational interviewing transcripts (150 high-quality, 103 low-quality), we extracted 42 features across four domains: conversational dynamics, semantic similarity as topic alignment, sentiment classification, and question detection. Classifiers, including Random Forest (RF), CatBoost, and Support Vector Machines (SVM), were hyperparameter-tuned and trained using stratified 5-fold cross-validation and evaluated on a holdout test set. On balanced (non-augmented) data, RF achieved the highest classification accuracy (76.7%), and SVM achieved the highest AUC (85.4%). After SMOTE-Tomek augmentation, performance improved significantly: RF achieved up to 88.9% accuracy, 90.0% F1-score, and 94.6% AUC, while SVM reached 81.1% accuracy, 83.1% F1-score, and 93.6% AUC. The augmented data results reflect the potential of the framework in future larger-scale applications. Feature contribution analysis revealed that conversational dynamics and semantic similarity between clients and therapists were among the top contributors, led by words uttered by the client (mean and standard deviation). The framework was robust across the original and augmented datasets and demonstrated consistent improvements in F1 scores and recall. While currently text-based, the framework supports future multimodal extensions (e.g., vocal tone, facial affect) for more holistic assessments. This work introduces a scalable, data-driven method for evaluating engagement quality of the therapy session, offering clinicians real-time feedback to enhance the quality of both virtual and in-person therapeutic interactions.
Fault Diagnosis of 3D-Printed Scaled Wind Turbine Blades
Esquivel-Sancho, Luis Miguel, Tehrani, Maryam Ghandchi, Muñoz-Arias, Mauricio, Askari, Mahmoud
This study presents an integrated methodology for fault detection in wind turbine blades using 3D-printed scaled models, finite element simulations, experimental modal analysis, and machine learning techniques. A scaled model of the NREL 5MW blade was fabricated using 3D printing, and crack-type damages were introduced at critical locations. Finite Element Analysis was employed to predict the impact of these damages on the natural frequencies, with the results validated through controlled hammer impact tests. Vibration data was processed to extract both time-domain and frequency-domain features, and key discriminative variables were identified using statistical analyses (ANOVA). Machine learning classifiers, including Support Vector Machine and K-Nearest Neighbors, achieved classification accuracies exceeding 94%. The results revealed that vibration modes 3, 4, and 6 are particularly sensitive to structural anomalies for this blade. This integrated approach confirms the feasibility of combining numerical simulations with experimental validations and paves the way for structural health monitoring systems in wind energy applications.
Unsupervised Anomaly Detection for Autonomous Robots via Mahalanobis SVDD with Audio-IMU Fusion
Yang, Yizhuo, Zhao, Jiulin, Xu, Xinhang, Cao, Kun, Yuan, Shenghai, Xie, Lihua
Reliable anomaly detection is essential for ensuring the safety of autonomous robots, particularly when conventional detection systems based on vision or LiDAR become unreliable in adverse or unpredictable conditions. In such scenarios, alternative sensing modalities are needed to provide timely and robust feedback. To this end, we explore the use of audio and inertial measurement unit (IMU) sensors to detect underlying anomalies in autonomous mobile robots, such as collisions and internal mechanical faults. Furthermore, to address the challenge of limited labeled anomaly data, we propose an unsupervised anomaly detection framework based on Mahalanobis Support Vector Data Description (M-SVDD). In contrast to conventional SVDD methods that rely on Euclidean distance and assume isotropic feature distributions, our approach employs the Mahalanobis distance to adaptively scale feature dimensions and capture inter-feature correlations, enabling more expressive decision boundaries. In addition, a reconstruction-based auxiliary branch is introduced to preserve feature diversity and prevent representation collapse, further enhancing the robustness of anomaly detection. Extensive experiments on a collected mobile robot dataset and four public datasets demonstrate the effectiveness of the proposed method, as shown in the video https://youtu.be/yh1tn6DDD4A. Anomaly detection is essential for ensuring the safety and reliability of many safety-critical systems, including industrial automation [1], aerospace [2] and robotic systems [3], [4]. In the context of autonomous mobile robots, the ability to detect anomalies, such as mechanical faults or unexpected collisions, is important for maintaining safe operations and preventing potential hazards. With the increasing deployment of robots in dynamic and unstructured environments, robust anomaly detection has become an indispensable component of autonomous operation.
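The contrast between Euclidean and Mahalanobis distance drawn in the abstract can be illustrated on a toy two-dimensional feature set. The sketch below is not the M-SVDD model itself: it only computes the two distances from a sample mean, and the data points and function names are made up for the example.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and 2x2 sample covariance of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def euclidean_2d(x, mean):
    return math.hypot(x[0] - mean[0], x[1] - mean[1])


def mahalanobis_2d(x, mean, cov):
    """sqrt((x - mean)^T cov^-1 (x - mean)) using the closed-form 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quad)


# Hypothetical "normal" features that are strongly correlated (y is close to x).
normal = [(-2.0, -2.1), (-1.0, -0.9), (0.0, 0.0), (1.0, 0.9), (2.0, 2.1)]
mean, cov = mean_and_cov_2d(normal)

along = (1.5, 1.5)    # follows the correlation seen in normal data
across = (1.5, -1.5)  # same Euclidean distance, but violates the correlation

d_along = mahalanobis_2d(along, mean, cov)
d_across = mahalanobis_2d(across, mean, cov)
```

Both query points are equally far from the mean in Euclidean terms, yet the Mahalanobis distance of `across` is far larger because it breaks the correlation present in the normal data. That is the sense in which a Mahalanobis metric can give a tighter boundary than an isotropic sphere.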
Crowding Out The Noise: Algorithmic Collective Action Under Differential Privacy
Solanki, Rushabh, Bhange, Meghana, Aïvodji, Ulrich, Creager, Elliot
The integration of AI into daily life has generated considerable attention and excitement, while also raising concerns about automating algorithmic harms and re-entrenching existing social inequities. While the responsible deployment of trustworthy AI systems is a worthy goal, there are many possible ways to realize it, from policy and regulation to improved algorithm design and evaluation. In fact, since AI trains on social data, there is even a possibility for everyday users, citizens, or workers to directly steer its behavior through Algorithmic Collective Action, by deliberately modifying the data they share with a platform to drive its learning process in their favor. This paper considers how these grassroots efforts to influence AI interact with methods already used by AI firms and governments to improve model trustworthiness. In particular, we focus on the setting where the AI firm deploys a differentially private model, motivated by the growing regulatory focus on privacy and data protection. We investigate how the use of Differentially Private Stochastic Gradient Descent (DPSGD) affects the collective's ability to influence the learning process. Our findings show that while differential privacy contributes to the protection of individual data, it introduces challenges for effective algorithmic collective action. We characterize lower bounds on the success of algorithmic collective action under differential privacy as a function of the collective's size and the firm's privacy parameters, and verify these trends experimentally by simulating collective action during the training of deep neural network classifiers across several datasets.
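For readers unfamiliar with DPSGD, one update step can be sketched as follows. This is the standard recipe of per-example gradient clipping followed by Gaussian noise, written in plain Python for illustration; it is not the authors' experimental code, and the function names, the flat-list parameter layout, and the example values are assumptions.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, clip_norm,
                noise_multiplier, rng=None):
    """One DP-SGD update: clip each example, sum, add noise, average, step."""
    rng = rng or random.Random()
    dim = len(params)
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Gaussian noise with standard deviation noise_multiplier * clip_norm.
    noisy = [s + rng.gauss(0.0, noise_multiplier * clip_norm) for s in summed]
    batch_size = len(per_example_grads)
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


# Hypothetical batch: the first example has a large gradient and gets clipped.
grads = [[3.0, 4.0], [0.3, 0.4]]
new_params = dp_sgd_step([0.0, 0.0], grads, lr=1.0, clip_norm=1.0,
                         noise_multiplier=1.1, rng=random.Random(0))
```

Clipping caps how far any single example can move the model, and the added noise masks what remains. Both effects apply equally to the modified examples contributed by a collective, which gives some intuition for why the privacy parameters and the collective's size appear together in the bounds described above.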
scDrugMap: Benchmarking Large Foundation Models for Drug Response Prediction
Wang, Qing, Pan, Yining, Zhou, Minghao, Tang, Zijia, Wang, Yanfei, Wang, Guangyu, Song, Qianqian
Drug resistance presents a major challenge in cancer therapy. Single cell profiling offers insights into cellular heterogeneity, yet the application of large-scale foundation models for predicting drug response in single cell data remains underexplored. To address this, we developed scDrugMap, an integrated framework featuring both a Python command-line interface and a web server for drug response prediction. scDrugMap evaluates a wide range of foundation models, including eight single-cell models and two large language models, using a curated dataset of over 326,000 cells in the primary collection and 18,800 cells in the validation set, spanning 36 datasets and diverse tissue and cancer types. We benchmarked model performance under pooled-data and cross-data evaluation settings, employing both layer freezing and Low-Rank Adaptation (LoRA) fine-tuning strategies. In the pooled-data scenario, scFoundation achieved the best performance, with mean F1 scores of 0.971 (layer freezing) and 0.947 (fine-tuning), outperforming the lowest-performing model by over 50%. In the cross-data setting, UCE excelled post fine-tuning (mean F1: 0.774), while scGPT led in zero-shot learning (mean F1: 0.858). Overall, scDrugMap provides the first large-scale benchmark of foundation models for drug response prediction in single-cell data and serves as a user-friendly, flexible platform for advancing drug discovery and translational research.
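As background on the LoRA strategy mentioned above, the core idea is to keep a pretrained weight matrix W frozen and learn a low-rank correction, so the effective weight is W + (alpha / r) * B A. The sketch below illustrates that arithmetic in plain Python; it is not scDrugMap code, and the matrix sizes, the `alpha` value, and the function names are chosen only for the example.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where A is r x d_in and B is d_out x r."""
    rank = len(a)
    delta = matmul(b, a)
    scale = alpha / rank
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_fraction(d_out, d_in, rank):
    """Adapter parameters (A and B) relative to the frozen matrix W."""
    return rank * (d_in + d_out) / (d_out * d_in)


w = [[1.0, 2.0], [3.0, 4.0]]   # frozen pretrained weights (toy values)
a = [[0.5, -0.5]]              # rank-1 factor A, shape 1 x 2
b_zero = [[0.0], [0.0]]        # B starts at zero, so the adapter is a no-op
b_trained = [[1.0], [2.0]]     # hypothetical values after fine-tuning

before = lora_effective_weight(w, a, b_zero, alpha=2.0)
after = lora_effective_weight(w, a, b_trained, alpha=2.0)
```

With B initialized to zero the adapted model starts out identical to the pretrained one. For a 1024 x 1024 layer at rank 8, `trainable_fraction` gives about 1.6% of the original parameter count, which is why LoRA is a common alternative to full fine-tuning of large foundation models.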
GaMNet: A Hybrid Network with Gabor Fusion and NMamba for Efficient 3D Glioma Segmentation
Ye, Chengwei, Zhang, Huanzhen, Lin, Yufei, Wang, Kangsheng, Xu, Linuo, Liu, Shuyan
Gliomas are aggressive brain tumors that pose serious health risks. Deep learning aids in lesion segmentation, but CNN and Transformer-based models often lack context modeling or demand heavy computation, limiting real-time use on mobile medical devices. We propose GaMNet, integrating the NMamba module for global modeling and a multi-scale CNN for efficient local feature extraction. To improve interpretability and mimic the human visual system, we apply Gabor filters at multiple scales. Our method achieves high segmentation accuracy with fewer parameters and faster computation. Extensive experiments show GaMNet outperforms existing methods, notably reducing false positives and negatives, which enhances the reliability of clinical diagnosis.
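To make the multi-scale Gabor filtering concrete, the sketch below builds a small bank of real-valued Gabor kernels, each a Gaussian envelope multiplied by a cosine carrier. It is a simplified 2D illustration under assumed parameters, whereas GaMNet operates on 3D volumes; the kernel size, scales, and function names are not taken from the paper.

```python
import math


def gabor_kernel(size, sigma, wavelength, theta=0.0, gamma=0.5, psi=0.0):
    """Real 2D Gabor kernel: Gaussian envelope times a cosine carrier."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2)
                                / (2.0 * sigma * sigma))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(scales, size=7):
    """One kernel per (sigma, wavelength) pair, i.e. one per scale."""
    return [gabor_kernel(size, sigma, wavelength)
            for sigma, wavelength in scales]


# Hypothetical scales: fine, medium, and coarse texture.
bank = gabor_bank([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
```

Convolving an image with each kernel in `bank` responds to oriented, band-limited structure at a different spatial scale, which is the property that links Gabor filters to models of early visual processing.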
Perils of Label Indeterminacy: A Case Study on Prediction of Neurological Recovery After Cardiac Arrest
Schoeffer, Jakob, De-Arteaga, Maria, Elmer, Jonathan
The design of AI systems to assist human decision-making typically requires the availability of labels to train and evaluate supervised models. Frequently, however, these labels are unknown, and different ways of estimating them involve unverifiable assumptions or arbitrary choices. In this work, we introduce the concept of label indeterminacy and derive important implications in high-stakes AI-assisted decision-making. We present an empirical study in a healthcare context, focusing specifically on predicting the recovery of comatose patients after resuscitation from cardiac arrest. Our study shows that label indeterminacy can result in models that perform similarly when evaluated on patients with known labels, but vary drastically in their predictions for patients where labels are unknown. After demonstrating crucial ethical implications of label indeterminacy in this high-stakes context, we discuss takeaways for evaluation, reporting, and design.