Performance Analysis
Machine Learning the 6d Supergravity Landscape
Brady, Nathan, Tennyson, David, Vandermeulen, Thomas
In this paper, we apply both supervised and unsupervised machine learning algorithms to the study of the string landscape and swampland in 6-dimensions. Our data are the (almost) anomaly-free 6-dimensional $\mathcal{N} = (1,0)$ supergravity models, characterised by the Gram matrix of anomaly coefficients. Our work demonstrates the ability of machine learning algorithms to efficiently learn highly complex features of the landscape and swampland. Employing an autoencoder for unsupervised learning, we provide an auto-classification of these models by compressing the Gram matrix data to 2-dimensions. Through compression, similar models cluster together, and we identify prominent features of these clusters. The autoencoder also identifies outlier models which are difficult to reconstruct. One of these outliers proves to be incredibly difficult to combine with other models such that the $\text{tr}R^{4}$ anomaly vanishes, making its presence in the landscape extremely rare. Further, we utilise supervised learning to build two classifiers predicting (1) model consistency under probe string insertion (precision: 0.78, predicting consistency for 214,837 models with reasonable certainty) and (2) inconsistency under anomaly inflow (precision: 0.91, predicting inconsistency for 1,909,359 models). Notably, projecting these predictions onto the autoencoder's 2-dimensional latent layer shows consistent models clustering together, further indicating that the autoencoder has learnt interesting and complex features of the set of models and potentially offers a novel approach to mapping the landscape and swampland of 6-dimensional supergravity theories.
ARCANE -- Early Detection of Interplanetary Coronal Mass Ejections
Rรผdisser, H. T., Nguyen, G., Louรซdec, J. Le, Davies, E. E., Mรถstl, C.
Interplanetary coronal mass ejections (ICMEs) are major drivers of space weather disturbances, posing risks to both technological infrastructure and human activities. Automatic detection of ICMEs in solar wind in situ data is essential for early warning systems. While several methods have been proposed to identify these structures in time series data, robust real-time detection remains a significant challenge. In this work, we present ARCANE - the first framework explicitly designed for early ICME detection in streaming solar wind data under realistic operational constraints, enabling event identification without requiring observation of the full structure. Our approach evaluates the strengths and limitations of detection models by comparing a machine learning-based method to a threshold-based baseline. The ResUNet++ model, previously validated on science data, significantly outperforms the baseline, particularly in detecting high-impact events, while retaining solid performance on lower-impact cases. Notably, we find that using real-time solar wind (RTSW) data instead of high-resolution science data leads to only minimal performance degradation. Despite the challenges of operational settings, our detection pipeline achieves an F1-Score of 0.37, with an average detection delay of 24.5% of the event's duration while processing only a minimal portion of the event data. As more data becomes available, the performance increases significantly. These results mark a substantial step forward in automated space weather monitoring and lay the groundwork for enhanced real-time forecasting capabilities.
Assessing AI-Generated Questions' Alignment with Cognitive Frameworks in Educational Assessment
Yaacoub, Antoun, Da-Rugna, Jรฉrรดme, Assaghir, Zainab
This study evaluates the integration of Bloom's Taxonomy into OneClickQuiz, an Artificial Intelligence (AI) driven plugin for automating Multiple-Choice Question (MCQ) generation in Moodle. Bloom's Taxonomy provides a structured framework for categorizing educational objectives into hierarchical cognitive levels. Our research investigates whether incorporating this taxonomy can improve the alignment of AI-generated questions with specific cognitive objectives. We developed a dataset of 3691 questions categorized according to Bloom's levels and employed various classification models-Multinomial Logistic Regression, Naive Bayes, Linear Support Vector Classification (SVC), and a Transformer-based model (DistilBERT)-to evaluate their effectiveness in categorizing questions. Our results indicate that higher Bloom's levels generally correlate with increased question length, Flesch-Kincaid Grade Level (FKGL), and Lexical Density (LD), reflecting the increased complexity of higher cognitive demands. Multinomial Logistic Regression showed varying accuracy across Bloom's levels, performing best for "Knowledge" and less accurately for higher-order levels. Merging higher-level categories improved accuracy for complex cognitive tasks. Naive Bayes and Linear SVC also demonstrated effective classification for lower levels but struggled with higher-order tasks. DistilBERT achieved the highest performance, significantly improving classification of both lower and higher-order cognitive levels, achieving an overall validation accuracy of 91%. This study highlights the potential of integrating Bloom's Taxonomy into AI-driven assessment tools and underscores the advantages of advanced models like DistilBERT for enhancing educational content generation.
Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices
King, Evan, Sabra, Adam, Kudlur, Manjunath, Wang, James, Warden, Pete
We present the Flavors of Moonshine, a suite of tiny automatic speech recognition (ASR) models specialized for a range of underrepresented languages. Prevailing wisdom suggests that multilingual ASR models outperform monolingual counterparts by exploiting cross-lingual phonetic similarities. We challenge this assumption, showing that for sufficiently small models (27M parameters), training monolingual systems on a carefully balanced mix of high-quality human-labeled, pseudo-labeled, and synthetic data yields substantially superior performance. On average, our models achieve error rates 48% lower than the comparably sized Whisper Tiny model, outperform the 9x larger Whisper Small model, and in most cases match or outperform the 28x larger Whisper Medium model. These results advance the state of the art for models of this size, enabling accurate on-device ASR for languages that previously had limited support. We release Arabic, Chinese, Japanese, Korean, Ukrainian, and Vietnamese Moonshine models under a permissive open-source license.
ESTM: An Enhanced Dual-Branch Spectral-Temporal Mamba for Anomalous Sound Detection
Ma, Chengyuan, Jia, Peng, Guo, Hongyue, Yang, Wenming
The core challenge in industrial equipment anoma lous sound detection (ASD) lies in modeling the time-frequency coupling characteristics of acoustic features. Existing modeling methods are limited by local receptive fields, making it difficult to capture long-range temporal patterns and cross-band dynamic coupling effects in machine acoustic features. In this paper, we propose a novel framework, ESTM, which is based on a dual-path Mamba architecture with time-frequency decoupled modeling and utilizes Selective State-Space Models (SSM) for long-range sequence modeling. ESTM extracts rich feature representations from different time segments and frequency bands by fusing enhanced Mel spectrograms and raw audio features, while further improving sensitivity to anomalous patterns through the TriStat-Gating (TSG) module. Our experiments demonstrate that ESTM improves anomalous detection performance on the DCASE 2020 Task 2 dataset, further validating the effectiveness of the proposed method.
Real-time ML-based Defense Against Malicious Payload in Reconfigurable Embedded Systems
Stahle-Smith, Rye, Karakchi, Rasha
The growing use of FPGAs in reconfigurable systems introducessecurity risks through malicious bitstreams that could cause denial-of-service (DoS), data leakage, or covert attacks. We investigated chip-level hardware malicious payload in embedded systems and proposed a supervised machine learning method to detect malicious bitstreams via static byte-level features. Our approach diverges from existing methods by analyzing bitstreams directly at the binary level, enabling real-time detection without requiring access to source code or netlists. Bitstreams were sourced from state-of-the-art (SOTA) benchmarks and re-engineered to target the Xilinx PYNQ-Z1 FPGA Development Board. Our dataset included 122 samples of benign and malicious configurations. The data were vectorized using byte frequency analysis, compressed using TSVD, and balanced using SMOTE to address class imbalance. The evaluated classifiers demonstrated that Random Forest achieved a macro F1-score of 0.97, underscoring the viability of real-time Trojan detection on resource-constrained systems. The final model was serialized and successfully deployed via PYNQ to enable integrated bitstream analysis.
Extrapolated Markov Chain Oversampling Method for Imbalanced Text Classification
Avela, Aleksi, Ilmonen, Pauliina
Text classification is the task of automatically assigning text documents correct labels from a predefined set of categories. In real-life (text) classification tasks, observations and misclassification costs are often unevenly distributed between the classes - known as the problem of imbalanced data. Synthetic oversampling is a popular approach to imbalanced classification. The idea is to generate synthetic observations in the minority class to balance the classes in the training set. Many general-purpose oversampling methods can be applied to text data; however, imbalanced text data poses a number of distinctive difficulties that stem from the unique nature of text compared to other domains. One such factor is that when the sample size of text increases, the sample vocabulary (i.e., feature space) is likely to grow as well. We introduce a novel Markov chain based text oversampling method. The transition probabilities are estimated from the minority class but also partly from the majority class, thus allowing the minority feature space to expand in oversampling. We evaluate our approach against prominent oversampling methods and show that our approach is able to produce highly competitive results against the other methods in several real data examples, especially when the imbalance is severe.
SegFormer Fine-Tuning with Dropout: Advancing Hair Artifact Removal in Skin Lesion Analysis
Saad, Asif Mohammed, Mahi, Umme Niraj
Hair artifacts in dermoscopic images present significant challenges for accurate skin lesion analysis, potentially obscuring critical diagnostic features in dermatological assessments. This work introduces a fine-tuned SegFormer model augmented with dropout regularization to achieve precise hair mask segmentation. The proposed SegformerWithDropout architecture leverages the MiT-B2 encoder, pretrained on ImageNet, with an in-channel count of 3 and 2 output classes, incorporating a dropout probability of 0.3 in the segmentation head to prevent overfitting. Training is conducted on a specialized dataset of 500 dermoscopic skin lesion images with fine-grained hair mask annotations, employing 10-fold cross-validation, AdamW optimization with a learning rate of 0.001, and cross-entropy loss. Early stopping is applied based on validation loss, with a patience of 3 epochs and a maximum of 20 epochs per fold. Performance is evaluated using a comprehensive suite of metrics, including Intersection over Union (IoU), Dice coefficient, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS). Experimental results from the cross-validation demonstrate robust performance, with average Dice coefficients reaching approximately 0.96 and IoU values of 0.93, alongside favorable PSNR (around 34 dB), SSIM (0.97), and low LPIPS (0.06), highlighting the model's effectiveness in accurate hair artifact segmentation and its potential to enhance preprocessing for downstream skin cancer detection tasks.
EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras
Mollyn, Vimal, Harrison, Chris
In augmented and virtual reality (AR/VR) experiences, a user's arms and hands can provide a convenient and tactile surface for touch input. Prior work has shown on-body input to have significant speed, accuracy, and ergonomic benefits over in-air interfaces, which are common today. In this work, we demonstrate high accuracy, bare hands (i.e., no special instrumentation of the user) skin input using just an RGB camera, like those already integrated into all modern XR headsets. Our results show this approach can be accurate, and robust across diverse lighting conditions, skin tones, and body motion (e.g., input while walking). Finally, our pipeline also provides rich input metadata including touch force, finger identification, angle of attack, and rotation. We believe these are the requisite technical ingredients to more fully unlock on-skin interfaces that have been well motivated in the HCI literature but have lacked robust and practical methods.
BM-CL: Bias Mitigation through the lens of Continual Learning
Mansilla, Lucas, Echeveste, Rodrigo, Gonzalez, Camila, Milone, Diego H., Ferrante, Enzo
Biases in machine learning pose significant challenges, particularly when models amplify disparities that affect disadvantaged groups. Traditional bias mitigation techniques often lead to a {\itshape leveling-down effect}, whereby improving outcomes of disadvantaged groups comes at the expense of reduced performance for advantaged groups. This study introduces Bias Mitigation through Continual Learning (BM-CL), a novel framework that leverages the principles of continual learning to address this trade-off. We postulate that mitigating bias is conceptually similar to domain-incremental continual learning, where the model must adjust to changing fairness conditions, improving outcomes for disadvantaged groups without forgetting the knowledge that benefits advantaged groups. Drawing inspiration from techniques such as Learning without Forgetting and Elastic Weight Consolidation, we reinterpret bias mitigation as a continual learning problem. This perspective allows models to incrementally balance fairness objectives, enhancing outcomes for disadvantaged groups while preserving performance for advantaged groups. Experiments on synthetic and real-world image datasets, characterized by diverse sources of bias, demonstrate that the proposed framework mitigates biases while minimizing the loss of original knowledge. Our approach bridges the fields of fairness and continual learning, offering a promising pathway for developing machine learning systems that are both equitable and effective.