 Performance Analysis


Dude, where's my utterance? Evaluating the effects of automatic segmentation and transcription on CPS detection

arXiv.org Artificial Intelligence

Collaborative Problem-Solving (CPS) markers capture key aspects of effective teamwork, such as staying on task, avoiding interruptions, and generating constructive ideas. An AI system that reliably detects these markers could help teachers identify when a group is struggling or demonstrating productive collaboration. Such a system requires an automated pipeline composed of multiple components. In this work, we evaluate how CPS detection is impacted by automating two critical components: transcription and speech segmentation. On the public Weights Task Dataset (WTD), we find CPS detection performance with automated transcription and segmentation methods is comparable to human-segmented and manually transcribed data; however, we find the automated segmentation methods reduce the number of utterances by 26.5%, impacting the granularity of the data. We discuss the implications for developing AI-driven tools that support collaborative learning in classrooms.


The Geometries of Truth Are Orthogonal Across Tasks

arXiv.org Machine Learning

Large Language Models (LLMs) have demonstrated impressive generalization capabilities across various tasks, but their claim to practical relevance is still marred by concerns about their reliability. Recent works have proposed examining the activations produced by an LLM at inference time to assess whether its answer to a question is correct. Some works claim that a "geometry of truth" can be learned from examples, in the sense that the activations that generate correct answers can be distinguished from those leading to mistakes with a linear classifier. In this work, we underline a limitation of these approaches: we observe that these "geometries of truth" are intrinsically task-dependent and fail to transfer across tasks. More precisely, we show that linear classifiers trained across distinct tasks share little similarity and, when trained with sparsity-enforcing regularizers, have almost disjoint supports. We show that more sophisticated approaches (e.g., using mixtures of probes and tasks) fail to overcome this limitation, likely because activation vectors commonly used to classify answers form clearly separated clusters when examined across tasks.


Dealing with Uncertainty in Contextual Anomaly Detection

arXiv.org Machine Learning

Contextual anomaly detection (CAD) aims to identify anomalies in a target (behavioral) variable conditioned on a set of contextual variables that influence the normalcy of the target variable but are not themselves indicators of anomaly. In this work, we propose a novel framework for CAD, normalcy score (NS), that explicitly models both the aleatoric and epistemic uncertainties. Built on heteroscedastic Gaussian process regression, our method regards the Z-score as a random variable, providing confidence intervals that reflect the reliability of the anomaly assessment. Through experiments on benchmark datasets and a real-world application in cardiology, we demonstrate that NS outperforms state-of-the-art CAD methods in both detection accuracy and interpretability. Moreover, confidence intervals enable an adaptive, uncertainty-driven decision-making process, which may be very important in domains such as healthcare.


Where to Intervene: Action Selection in Deep Reinforcement Learning

arXiv.org Machine Learning

Deep reinforcement learning (RL) has gained widespread adoption in recent years but faces significant challenges, particularly in unknown and complex environments. Among these, high-dimensional action selection stands out as a critical problem. Existing works often require a sophisticated prior design to eliminate redundancy in the action space, relying heavily on domain expert experience or involving high computational complexity, which limits their generalizability across different RL tasks. In this paper, we address these challenges by proposing a general data-driven action selection approach with model-free and computationally friendly properties. Our method not only selects minimal sufficient actions but also controls the false discovery rate via knockoff sampling. More importantly, we seamlessly integrate the action selection into deep RL methods during online training. Empirical experiments validate the established theoretical guarantees, demonstrating that our method surpasses various alternative techniques in terms of both performance in variable selection and overall achieved rewards.


Model selection for stochastic dynamics: a parsimonious and principled approach

arXiv.org Machine Learning

This thesis focuses on the discovery of stochastic differential equations (SDEs) and stochastic partial differential equations (SPDEs) from noisy and discrete time series. A major challenge is selecting the simplest possible correct model from vast libraries of candidate models, where standard information criteria (AIC, BIC) are often limited. We introduce PASTIS (Parsimonious Stochastic Inference), a new information criterion derived from extreme value theory. Its penalty term, $n_\mathcal{B} \ln(n_0/p)$, explicitly incorporates the size of the initial library of candidate parameters ($n_0$), the number of parameters in the considered model ($n_\mathcal{B}$), and a significance threshold ($p$). This significance threshold represents the probability of selecting a model containing more parameters than necessary when comparing many models. Benchmarks on various systems (Lorenz, Ornstein-Uhlenbeck, Lotka-Volterra for SDEs; Gray-Scott for SPDEs) demonstrate that PASTIS outperforms AIC, BIC, cross-validation (CV), and SINDy (a competing method) in terms of exact model identification and predictive capability. Furthermore, real-world data can be subject to large sampling intervals ($\Delta t$) or measurement noise ($\sigma$), which can impair model learning and selection capabilities. To address this, we have developed robust variants of PASTIS, PASTIS-$\Delta t$ and PASTIS-$\sigma$, thus extending the applicability of the approach to imperfect experimental data. PASTIS thus provides a statistically grounded, validated, and practical methodological framework for discovering simple models for processes with stochastic dynamics.
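
As a rough illustration of the penalty term quoted above, the following sketch evaluates $n_\mathcal{B} \ln(n_0/p)$ for a few model sizes. The function name `pastis_penalty`, the accepted range for `p`, and the example values are hypothetical and chosen only for illustration; the abstract does not say how the penalty is combined with a likelihood term, so this is not the authors' implementation.

```python
import math


def pastis_penalty(n_b: int, n_0: int, p: float) -> float:
    """Evaluate the penalty n_B * ln(n_0 / p) quoted in the abstract.

    n_b: number of parameters in the candidate model (n_B)
    n_0: size of the initial library of candidate parameters
    p:   significance threshold; the range 0 < p <= 1 is an assumption
    """
    if n_b < 0 or n_0 <= 0 or not (0.0 < p <= 1.0):
        raise ValueError("expected n_b >= 0, n_0 > 0 and 0 < p <= 1")
    return n_b * math.log(n_0 / p)


if __name__ == "__main__":
    # Hypothetical setting: a library of 100 candidate terms, p = 0.001.
    for n_b in (1, 3, 10):
        print(n_b, round(pastis_penalty(n_b, n_0=100, p=0.001), 3))
```

Read directly off the formula, the penalty grows linearly with the number of parameters in the model and logarithmically with the library size, and it increases as the threshold $p$ is made smaller.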


Sequential Regression Learning with Randomized Algorithms

arXiv.org Machine Learning

This paper presents "randomized SINDy", a sequential machine learning algorithm designed for dynamic data that has a time-dependent structure. It employs a probabilistic approach, with its PAC learning property rigorously proven through the mathematical theory of functional analysis. The algorithm dynamically predicts using a learned probability distribution of predictors, updating weights via gradient descent and a proximal algorithm to maintain a valid probability density. Inspired by SINDy (Brunton et al. 2016), it incorporates feature augmentation and Tikhonov regularization. For multivariate normal weights, the proximal step is omitted to focus on parameter estimation. The algorithm's effectiveness is demonstrated through experimental results in regression and binary classification using real-world data.


A Two-Stage Ensemble Feature Selection and Particle Swarm Optimization Approach for Micro-Array Data Classification in Distributed Computing Environments

arXiv.org Artificial Intelligence

High dimensionality in datasets produced by microarray technology presents a challenge for Machine Learning (ML) algorithms, particularly in terms of dimensionality reduction and handling imbalanced sample sizes. To mitigate these problems, we propose hybrid ensemble feature selection techniques with a majority voting classifier for microarray classification. We consider both filter and wrapper-based feature selection techniques, including Mutual Information (MI), Chi-Square, Variance Threshold (VT), Least Absolute Shrinkage and Selection Operator (LASSO), Analysis of Variance (ANOVA), and Recursive Feature Elimination (RFE), followed by Particle Swarm Optimization (PSO) for selecting the optimal features. This Artificial Intelligence (AI) approach leverages a Majority Voting Classifier that combines multiple machine learning models, such as Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), to enhance overall performance and accuracy. By leveraging the strengths of each model, the ensemble approach aims to provide more reliable and effective diagnostic predictions. The efficacy of the proposed model has been tested in both local and cloud environments. In the cloud environment, three virtual machines with virtual Central Processing Unit (vCPU) sizes of 8, 16, and 64 bits were used to demonstrate the model performance. The experiments show that the vCPU with 64 bits provides better classification accuracies of 95.89%, 97.50%, 99.13%, 99.58%, 99.11%, and 94.60% with six microarray datasets, Mixed Lineage Leukemia (MLL), Leukemia, Small Round Blue Cell Tumors (SRBCT), Lymphoma, Ovarian, and Lung, respectively, validating the effectiveness of the proposed model in both local and cloud environments.
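
The majority-voting step described above can be sketched in plain Python. This is a minimal illustration of hard voting over predicted labels, not the authors' pipeline: the abstract does not state whether hard or soft voting is used, and the function name `majority_vote`, the toy predictions, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-model label predictions by hard majority voting.

    predictions[m][i] is the label that model m assigns to sample i.
    Ties go to the label that appears first among the votes, which is
    an arbitrary choice made for this sketch.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


if __name__ == "__main__":
    # Hypothetical outputs of three models (e.g. LR, RF, XGBoost) on 4 samples.
    lr = [1, 0, 1, 0]
    rf = [1, 1, 0, 0]
    xgb = [0, 1, 1, 0]
    print(majority_vote([lr, rf, xgb]))
```

With an odd number of models and binary labels, as in the toy example, ties cannot occur, so the tie-breaking rule only matters for even ensembles or multi-class labels.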


DISPROTBENCH: A Disorder-Aware, Task-Rich Benchmark for Evaluating Protein Structure Prediction in Realistic Biological Contexts

arXiv.org Artificial Intelligence

Recent advances in protein structure prediction have achieved near-atomic accuracy for well-folded proteins. However, current benchmarks inadequately assess model performance in biologically challenging contexts, especially those involving intrinsically disordered regions (IDRs), limiting their utility in applications such as drug discovery, disease variant interpretation, and protein interface design. We introduce DisProtBench, a comprehensive benchmark for evaluating protein structure prediction models (PSPMs) under structural disorder and complex biological conditions. DisProtBench spans three key axes: (1) Data complexity, covering disordered regions, G protein-coupled receptor (GPCR) ligand pairs, and multimeric complexes; (2) Task diversity, benchmarking twelve leading PSPMs across structure-based tasks with unified classification, regression, and interface metrics; and (3) Interpretability, via the DisProtBench Portal, which provides precomputed 3D structures and visual error analyses. Our results reveal significant variability in model robustness under disorder, with low-confidence regions linked to functional prediction failures. Notably, global accuracy metrics often fail to predict task performance in disordered settings, emphasizing the need for function-aware evaluation. DisProtBench establishes a reproducible, extensible, and biologically grounded framework for assessing next-generation PSPMs in realistic biomedical scenarios.


Biaxialformer: Leveraging Channel Independence and Inter-Channel Correlations in EEG Signal Decoding for Predicting Neurological Outcomes

arXiv.org Artificial Intelligence

Accurate decoding of EEG signals requires comprehensive modeling of both temporal dynamics within individual channels and spatial dependencies across channels. While Transformer-based models utilizing channel-independence (CI) strategies have demonstrated strong performance in various time series tasks, they often overlook the inter-channel correlations that are critical in multivariate EEG signals. This omission can lead to information degradation and reduced prediction accuracy, particularly in complex tasks such as neurological outcome prediction. To address these challenges, we propose Biaxialformer, characterized by a meticulously engineered two-stage attention-based framework. By employing joint learning of positional encodings, Biaxialformer preserves both temporal and spatial relationships in EEG data, mitigating the inter-channel correlation forgetting problem common in traditional CI models. To enhance spatial feature extraction, we leverage bipolar EEG signals, which capture inter-hemispheric brain interactions, a critical but often overlooked aspect in EEG analysis. Our study broadens the use of Transformer-based models by addressing the challenge of predicting neurological outcomes in comatose patients. Impact Statement: Decisions about continued treatment for comatose patients hinge on uncertain predictions of brain recovery, leaving families and clinicians in a difficult position. This work delivers a reliable AI-based forecast of recovery chances by analyzing routine EEGs, consistently across multiple hospitals. This clarity can guide doctors toward personalized treatment plans, reduce the use of invasive or costly procedures with little benefit, and give families timely, trustworthy information when weighing care options. This work was supported in part by the Health South East Authority in Norway, Helse Sør-Øst RHF (HSØ: New Realtime Decision Support during Blood Loss using Machine Learning on Vital Signs) under Grant No. 19/00264-202, and Prosjektnummer 2020079.


Detection of Disengagement from Voluntary Quizzes: An Explainable Machine Learning Approach in Higher Distance Education

arXiv.org Artificial Intelligence

Students disengaging from their tasks can have serious long-term consequences, including academic drop-out. This is particularly relevant for students in distance education. One way to measure the level of disengagement in distance education is to observe participation in non-mandatory exercises in different online courses. In this paper, we detect student disengagement in the non-mandatory quizzes of 42 courses in four semesters from a distance-based university. We carefully identified the most informative student log data that could be extracted and processed from Moodle. Then, eight machine learning algorithms were trained and compared to obtain the highest possible prediction accuracy. Using the SHAP method, we developed an explainable machine learning framework that allows practitioners to better understand the decisions of the trained algorithm. The experimental results show a balanced accuracy of 91%, where about 85% of disengaged students were correctly detected. On top of the highly predictive performance and explainable framework, we provide a discussion on how to design a timely intervention to minimise disengagement from voluntary tasks in online learning. The advent of distance education has made learning more flexible than ever before. Instead of having to attend classes and solve tasks at specific times, students are granted more freedom in choosing when to engage with their academic workload. This flexibility attracts many non-traditional student groups to higher education, including students who are employed outside of their studies, either fully or part-time. While deadlines are still set in place, students are themselves responsible for planning and time management, especially as far as non-mandatory tasks and exercises are concerned. This freedom can also lead to satisficing behaviour, meaning students only do the bare minimum to pass their courses (see e.g., [1], [2]).
The COVID-19 pandemic is thought to have fostered this kind of behaviour even more [4]. Non-completion of voluntary tasks, such as optional quizzes, is a form of behavioural disengagement strongly linked to academic drop-out or attrition [5]-[8].

Bergamin are with the Institute for Research in Open-, Distance- and eLearning, Swiss Distance University of Applied Sciences, Brig, CH-3900, Switzerland (e-mail addresses: behnam.parsaeifard@ffhs.ch, N. Bergamin (e-mail address: nicole.bergamin@ffhs.ch) is with Department of Informatics, Swiss Distance University of Applied Sciences, Brig, CH-3900, Switzerland. Bergamin is also with the North-West University, Potchefstroom, 2531, South Africa.