Goto

Collaborating Authors

 Performance Analysis


Efficient dataset construction using active learning and uncertainty-aware neural networks for plasma turbulent transport surrogate models

arXiv.org Artificial Intelligence

This work demonstrates a proof-of-principle for using uncertainty-aware architectures, in combination with active learning techniques and an in-the-loop physics simulation code as a data labeller, to construct efficient datasets for data-driven surrogate model generation. Building off of a previous proof-of-principle successfully demonstrating training set reduction on static pre-labelled datasets, using the ADEPT framework, this strategy was applied again to the plasma turbulent transport problem within tokamak fusion plasmas, specifically the QuaLiKiz quasilinear electrostatic gyrokinetic turbulent transport code. While QuaLiKiz provides relatively fast evaluations, this study specifically targeted small datasets to serve as a proxy for more expensive codes, such as CGYRO or GENE. The newly implemented algorithm uses the SNGP architecture for the classification component of the problem and the BNN-NCP architecture for the regression component, training models for all turbulent modes (ITG, TEM, ETG) and all transport fluxes ($Q_e$, $Q_i$, $ฮ“_e$, $ฮ“_i$, and $ฮ _i$) described by the general QuaLiKiz output. With 45 active learning iterations, moving from a small initial training set of $10^{2}$ to a final set of $10^{4}$, the resulting models reached a $F_1$ classification performance of ~0.8 and a $R^2$ regression performance of ~0.75 on an independent test set across all outputs. This extrapolates to reaching the same performance and efficiency as the previous ADEPT pipeline, although on a problem with 1 extra input dimension. While the improvement rate achieved in this implementation diminishes faster than expected, the overall technique is formulated with components that can be upgraded and generalized to many surrogate modeling applications beyond plasma turbulent transport predictions.


VCDiag: Classifying Erroneous Waveforms for Failure Triage Acceleration

arXiv.org Artificial Intelligence

Failure triage in design functional verification is critical but time-intensive, relying on manual specification reviews, log inspections, and waveform analyses. While machine learning (ML) has improved areas like stimulus generation and coverage closure, its application to RTL-level simulation failure triage, particularly for large designs, remains limited. VCDiag offers an efficient, adaptable approach using VCD data to classify failing waveforms and pinpoint likely failure locations. In the largest experiment, VCDiag achieves over 94% accuracy in identifying the top three most likely modules. The framework introduces a novel signal selection and statistical compression approach, achieving over 120x reduction in raw data size while preserving features essential for classification. It can also be integrated into diverse Verilog/SystemVerilog designs and testbenches.


Single Domain Generalization in Diabetic Retinopathy: A Neuro-Symbolic Learning Approach

arXiv.org Artificial Intelligence

Domain generalization remains a critical challenge in medical imaging, where models trained on single sources often fail under real-world distribution shifts. We propose KG-DG, a neuro-symbolic framework for diabetic retinopathy (DR) classification that integrates vision transformers with expert-guided symbolic reasoning to enable robust generalization across unseen domains. Our approach leverages clinical lesion ontologies through structured, rule-based features and retinal vessel segmentation, fusing them with deep visual representations via a confidence-weighted integration strategy. The framework addresses both single-domain generalization (SDG) and multi-domain generalization (MDG) by minimizing the KL divergence between domain embeddings, thereby enforcing alignment of high-level clinical semantics. Extensive experiments across four public datasets (APTOS, EyePACS, Messidor-1, Messidor-2) demonstrate significant improvements: up to a 5.2% accuracy gain in cross-domain settings and a 6% improvement over baseline ViT models. Notably, our symbolic-only model achieves a 63.67% average accuracy in MDG, while the complete neuro-symbolic integration achieves the highest accuracy compared to existing published baselines and benchmarks in challenging SDG scenarios. Ablation studies reveal that lesion-based features (84.65% accuracy) substantially outperform purely neural approaches, confirming that symbolic components act as effective regularizers beyond merely enhancing interpretability. Our findings establish neuro-symbolic integration as a promising paradigm for building clinically robust, and domain-invariant medical AI systems.


Enhancing Machine Learning for Imbalanced Medical Data: A Quantum-Inspired Approach to Synthetic Oversampling (QI-SMOTE)

arXiv.org Artificial Intelligence

Class imbalance remains a critical challenge in machine learning (ML), particularly in the medical domain, where underrepresented minority classes lead to biased models and reduced predictive performance. This study introduces Quantum-Inspired SMOTE (QI-SMOTE), a novel data augmentation technique that enhances the performance of ML classifiers, including Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), k-Nearest Neighbors (KNN), Gradient Boosting (GB), and Neural Networks, by leveraging quantum principles such as quantum evolution and layered entanglement. Unlike conventional oversampling methods, QI-SMOTE generates synthetic instances that preserve complex data structures, improving model generalization and classification accuracy. We validate QI-SMOTE on the MIMIC-III and MIMIC-IV datasets, using mortality detection as a benchmark task due to their clinical significance and inherent class imbalance. We compare our method against traditional oversampling techniques, including Borderline-SMOTE, ADASYN, SMOTE-ENN, SMOTE-TOMEK, and SVM-SMOTE, using key performance metrics such as Accuracy, F1-score, G-Mean, and AUC-ROC. The results demonstrate that QI-SMOTE significantly improves the effectiveness of ensemble methods (RF, GB, ADA), kernel-based models (SVM), and deep learning approaches by producing more informative and balanced training data. By integrating quantum-inspired transformations into the ML pipeline, QI-SMOTE not only mitigates class imbalance but also enhances the robustness and reliability of predictive models in medical diagnostics and decision-making. This study highlights the potential of quantum-inspired resampling techniques in advancing state-of-the-art ML methodologies.


Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models

arXiv.org Artificial Intelligence

Parallel to the development of advanced deepfake audio generation, audio deepfake detection has also seen significant progress. However, a standardized and comprehensive benchmark is still missing. To address this, we introduce Speech DeepFake (DF) Arena, the first comprehensive benchmark for audio deepfake detection. Speech DF Arena provides a toolkit to uniformly evaluate detection systems, currently across 14 diverse datasets and attack scenarios, standardized evaluation metrics and protocols for reproducibility and transparency. It also includes a leaderboard to compare and rank the systems to help researchers and developers enhance their reliability and robustness. We include 14 evaluation sets, 12 state-of-the-art open-source and 3 proprietary detection systems. Our study presents many systems exhibiting high EER in out-of-domain scenarios, highlighting the need for extensive cross-domain evaluation. The leaderboard is hosted on Huggingface1 and a toolkit for reproducing results across the listed datasets is available on GitHub.


Multi-Scale Deep Learning for Colon Histopathology: A Hybrid Graph-Transformer Approach

arXiv.org Artificial Intelligence

Colon cancer also known as Colorectal cancer, is one of the most malignant types of cancer worldwide. Early-stage detection of colon cancer is highly crucial to prevent its deterioration. This research presents a hybrid multi-scale deep learning architecture that synergizes capsule networks, graph attention mechanisms, transformer modules, and residual learning to advance colon cancer classification on the Lung and Colon Cancer Histopathological Image Dataset (LC25000) dataset. The proposed model in this paper utilizes the HG-TNet model that introduces a hybrid architecture that joins strength points in transformers and convolutional neural networks to capture multi-scale features in histopathological images. Mainly, a transformer branch extracts global contextual bonds by partitioning the image into patches by convolution-based patch embedding and then processing these patches through a transformer encoder. Analogously, a dedicated CNN branch captures fine-grained, local details through successive Incorporation these diverse features, combined with a self-supervised rotation prediction objective, produce a robust diagnostic representation that surpasses standard architectures in performance. Results show better performance not only in accuracy or loss function but also in these algorithms by utilizing capsule networks to preserve spatial orders and realize how each element individually combines and forms whole structures.


A Single Detect Focused YOLO Framework for Robust Mitotic Figure Detection

arXiv.org Artificial Intelligence

Mitotic figure detection is a crucial task in computational pathology, as mitotic activity serves as a strong prognostic marker for tumor aggressiveness. However, domain variability that arises from differences in scanners, tissue types, and staining protocols poses a major challenge to the robustness of automated detection methods. In this study, we introduce SDF-YOLO (Single Detect Focused YOLO), a lightweight yet domain-robust detection framework designed specifically for small, rare targets such as mitotic figures. The model builds on YOLOv11 with task-specific modifications, including a single detection head aligned with mitotic figure scale, coordinate attention to enhance positional sensitivity, and improved cross-channel feature mixing. Experiments were conducted on three datasets that span human and canine tumors: MIDOG ++, canine cutaneous mast cell tumor (CCMCT), and canine mammary carcinoma (CMC). When submitted to the preliminary test set for the MIDOG2025 challenge, SDF-YOLO achieved an average precision (AP) of 0.799, with a precision of 0.758, a recall of 0.775, an F1 score of 0.766, and an FROC-AUC of 5.793, demonstrating both competitive accuracy and computational efficiency. These results indicate that SDF-YOLO provides a reliable and efficient framework for robust mitotic figure detection across diverse domains.


Lessons Learned from Deploying Adaptive Machine Learning Agents with Limited Data for Real-time Cell Culture Process Monitoring

arXiv.org Artificial Intelligence

This study explores the deployment of three machine learning (ML) approaches for real-time prediction of glucose, lactate, and ammonium concentrations in cell culture processes, using Raman spectroscopy as input features. The research addresses challenges associated with limited data availability and process variability, providing a comparative analysis of pretrained models, just-in-time learning (JITL), and online learning algorithms. Two industrial case studies are presented to evaluate the impact of varying bioprocess conditions on model performance. The findings highlight the specific conditions under which pretrained models demonstrate superior predictive accuracy and identify scenarios where JITL or online learning approaches are more effective for adaptive process monitoring. This study also highlights the critical importance of updating the deployed models/agents with the latest offline analytical measurements during bioreactor operations to maintain the model performance against the changes in cell growth behaviours and operating conditions throughout the bioreactor run. Additionally, the study confirms the usefulness of a simple mixture-of-experts framework in achieving enhanced accuracy and robustness for real-time predictions of metabolite concentrations based on Raman spectral data. These insights contribute to the development of robust strategies for the efficient deployment of ML models in dynamic and changing biomanufacturing environments.


Beyond Synthetic Augmentation: Group-Aware Threshold Calibration for Robust Balanced Accuracy in Imbalanced Learning

arXiv.org Artificial Intelligence

Class imbalance remains a fundamental challenge in machine learning, with traditional solutions often creating as many problems as they solve. We demonstrate that group-aware threshold calibration--setting different decision thresholds for different demographic groups--provides superior robustness compared to synthetic data generation methods. Through extensive experiments, we show that group-specific thresholds achieve 1.5-4% higher balanced accuracy than SMOTE and CT-GAN augmented models while improving worst-group balanced accuracy. Unlike single-threshold approaches that apply one cutoff across all groups, our group-aware method optimizes the Pareto frontier between balanced accuracy and worst-group balanced accuracy, enabling fine-grained control over group-level performance. Critically, we find that applying group thresholds to synthetically augmented data yields minimal additional benefit, suggesting these approaches are fundamentally redundant. Our results span seven model families including linear, tree-based, instance-based, and boosting methods, confirming that group-aware threshold calibration offers a simpler, more interpretable, and more effective solution to class imbalance.


Semi-Supervised Bayesian GANs with Log-Signatures for Uncertainty-Aware Credit Card Fraud Detection

arXiv.org Machine Learning

We present a novel deep generative semi-supervised framework for credit card fraud detection, formulated as time series classification task. As financial transaction data streams grow in scale and complexity, traditional methods often require large labeled datasets, struggle with time series of irregular sampling frequencies and varying sequence lengths. To address these challenges, we extend conditional Generative Adversarial Networks (GANs) for targeted data augmentation, integrate Bayesian inference to obtain predictive distributions and quantify uncertainty, and leverage log-signatures for robust feature encoding of transaction histories. We introduce a novel Wasserstein distance-based loss to align generated and real unlabeled samples while simultaneously maximizing classification accuracy on labeled data. Our approach is evaluated on the BankSim dataset, a widely used simulator for credit card transaction data, under varying proportions of labeled samples, demonstrating consistent improvements over benchmarks in both global statistical and domain-specific metrics. These findings highlight the effectiveness of GAN-driven semi-supervised learning with log-signatures for irregularly sampled time series and emphasize the importance of uncertainty-aware predictions.