 Performance Analysis


Uncertainty Quantification in the Tsetlin Machine

arXiv.org Artificial Intelligence

Data modeling using Tsetlin machines (TMs) is all about building logical rules from the data features. The decisions of the model are based on a combination of these logical rules. Hence, the model is fully transparent and it is possible to get explanations of its predictions. In this paper, we present a probability score for TM predictions and develop new techniques for uncertainty quantification to increase the explainability further. The probability score is an inherent property of any TM variant and is derived through an analysis of the TM learning dynamics. Simulated data is used to show a clear connection between the learned TM probability scores and the underlying probabilities of the data. A visualization of the probability scores also reveals that the TM is less confident in its predictions outside the training data domain, which contrasts with the typical extrapolation phenomenon found in Artificial Neural Networks. The paper concludes with an application of the uncertainty quantification techniques to an image classification task using the CIFAR-10 dataset, where they provide new insights and suggest possible improvements to current TM image classification models.


Improving LLM Reasoning for Vulnerability Detection via Group Relative Policy Optimization

arXiv.org Artificial Intelligence

Improving and understanding the training dynamics and reasoning of Large Language Models (LLMs) has become essential for their deployment in AI-based security tools, such as software vulnerability detection. In this work, we present an extensive study aimed at advancing recent RL-based finetuning techniques for LLMs in the context of vulnerability detection. We start by highlighting key limitations of commonly adopted LLMs, such as their tendency to over-predict certain types of vulnerabilities while failing to detect others. To address this challenge, we explore the use of Group Relative Policy Optimization (GRPO), a recent policy-gradient method, for guiding LLM behavior through structured, rule-based rewards. We enable its application to the vulnerability detection task by redefining its advantage functions and reward signals using annotations from widely used datasets in the field, including BigVul, DiverseVul, and CleanVul. The proposed methodology enables an extensive set of experiments, addressing multiple research questions regarding the impact of GRPO on generalization, reasoning capabilities, and performance improvements over standard supervised finetuning (SFT). Our findings offer valuable insights into the potential of RL-based training to enhance both the performance and reasoning abilities of LLMs in the context of software vulnerability detection.
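The group-relative idea behind GRPO can be illustrated with a minimal sketch. It shows only the commonly described baseline form, in which each sampled answer's reward is standardised against the other answers drawn for the same prompt; it is not the redefined advantage function the abstract refers to. The reward rule, the function names, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def rule_based_reward(predicted_label, true_label, has_reasoning):
    """Hypothetical rule-based reward: 1.0 for a correct label,
    plus a small bonus when the answer includes a reasoning section."""
    reward = 1.0 if predicted_label == true_label else 0.0
    if has_reasoning:
        reward += 0.1
    return reward


def group_relative_advantages(rewards, eps=1e-8):
    """Standardise rewards within one group of answers sampled for the
    same prompt: A_i = (r_i - mean(r)) / (std(r) + eps).
    Population std is an assumption; implementations differ on this."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers for one function whose dataset label is "vulnerable".
samples = [
    ("vulnerable", True),
    ("safe", False),
    ("vulnerable", False),
    ("safe", True),
]
rewards = [rule_based_reward(label, "vulnerable", reasoning)
           for label, reasoning in samples]
advantages = group_relative_advantages(rewards)
```

Answers that beat the group average receive a positive advantage and the rest a negative one, so no separate value network is needed to form the baseline.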


LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction

arXiv.org Artificial Intelligence

The discovery of new ionizable lipids for efficient lipid nanoparticle (LNP)-mediated RNA delivery remains a critical bottleneck for RNA-based therapeutics development. Recent advances have highlighted the potential of machine learning (ML) to predict transfection efficiency from molecular structure, enabling high-throughput virtual screening and accelerating lead identification. However, existing approaches are hindered by inadequate data quality, ineffective feature representations, low predictive accuracy, and poor generalizability. Here, we present LANTERN (Lipid nANoparticle Transfection Efficiency pRedictioN), a robust ML framework for predicting transfection efficiency based on ionizable lipid representation. We benchmarked a diverse set of ML models against AGILE, a previously published model developed for transfection prediction. Our results show that combining simpler models with chemically informative features, particularly count-based Morgan fingerprints, outperforms more complex models that rely on internally learned embeddings, such as AGILE. We also show that a multi-layer perceptron trained on a combination of Morgan fingerprints and Expert descriptors achieved the highest performance ($\text{R}^2$ = 0.8161, r = 0.9053), significantly exceeding AGILE ($\text{R}^2$ = 0.2655, r = 0.5488). We show that the models in LANTERN consistently have strong performance across multiple evaluation metrics. Thus, LANTERN offers a robust benchmarking framework for LNP transfection prediction and serves as a valuable tool for accelerating lipid-based RNA delivery systems design.
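For readers less familiar with the two metrics quoted above, the following sketch computes the coefficient of determination and Pearson's r from their standard definitions. The function names and the toy numbers are illustrative assumptions and are unrelated to the LANTERN data.

```python
from math import sqrt


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy values: predictions that track the targets but carry a constant offset.
y_true = [1.0, 2.0, 3.0, 4.0]
y_shifted = [3.0, 4.0, 5.0, 6.0]

r2_shifted = r_squared(y_true, y_shifted)
r_shifted = pearson_r(y_true, y_shifted)
```

The shifted predictions stay perfectly correlated with the targets while the coefficient of determination drops sharply, which is one reason to report both metrics side by side.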


Personalised Explanations in Long-term Human-Robot Interactions

arXiv.org Artificial Intelligence

In the field of Human-Robot Interaction (HRI), a fundamental challenge is to facilitate human understanding of robots. The emerging domain of eXplainable HRI (XHRI) investigates methods to generate explanations and evaluate their impact on human-robot interactions. Previous works have highlighted the need to personalise the level of detail of these explanations to enhance usability and comprehension. Our paper presents a framework designed to update and retrieve user knowledge-memory models, allowing for adapting the explanations' level of detail while referencing previously acquired concepts. Three architectures based on our proposed framework that use Large Language Models (LLMs) are evaluated in two distinct scenarios: a hospital patrolling robot and a kitchen assistant robot. Experimental results demonstrate that a two-stage architecture, which first generates an explanation and then personalises it, is the framework architecture that effectively reduces the level of detail only when there is related user knowledge.


Identification of Potentially Misclassified Crash Narratives using Machine Learning (ML) and Deep Learning (DL)

arXiv.org Artificial Intelligence

This research investigates the efficacy of machine learning (ML) and deep learning (DL) methods in detecting misclassified intersection-related crashes in police-reported narratives. Using 2019 crash data from the Iowa Department of Transportation, we implemented and compared a comprehensive set of models, including Support Vector Machine (SVM), XGBoost, BERT Sentence Embeddings, BERT Word Embeddings, and Albert Model. Model performance was systematically validated against expert reviews of potentially misclassified narratives, providing a rigorous assessment of classification accuracy. Results demonstrated that while traditional ML methods exhibited superior overall performance compared to some DL approaches, the Albert Model achieved the highest agreement with expert classifications (73% with Expert 1) and original tabular data (58%). Statistical analysis revealed that the Albert Model maintained performance levels similar to inter-expert consistency rates, significantly outperforming other approaches, particularly on ambiguous narratives. This work addresses a critical gap in transportation safety research through multi-modal integration analysis, which achieved a 54.2% reduction in error rates by combining narrative text with structured crash data. We conclude that hybrid approaches combining automated classification with targeted expert review offer a practical methodology for improving crash data quality, with substantial implications for transportation safety management and policy development.


AI-driven Web Application for Early Detection of Sudden Death Syndrome (SDS) in Soybean Leaves Using Hyperspectral Images and Genetic Algorithm

arXiv.org Artificial Intelligence

Sudden Death Syndrome (SDS), caused by Fusarium virguliforme, poses a significant threat to soybean production. This study presents an AI-driven web application for early detection of SDS on soybean leaves using hyperspectral imaging, enabling diagnosis prior to visible symptom onset. Leaf samples from healthy and inoculated plants were scanned using a portable hyperspectral imaging system (398-1011 nm), and a Genetic Algorithm was employed to select five informative wavelengths (505.4, 563.7, 712.2, 812.9, and 908.4 nm) critical for discriminating infection status. These selected bands were fed into a lightweight Convolutional Neural Network (CNN) to extract spatial-spectral features, which were subsequently classified using ten classical machine learning models. Ensemble classifiers (Random Forest, AdaBoost), Linear SVM, and Neural Net achieved the highest accuracy (>98%) and minimal error across all folds, as confirmed by confusion matrices and cross-validation metrics. Poor performance by Gaussian Process and QDA highlighted their unsuitability for this dataset. The trained models were deployed within a web application that enables users to upload hyperspectral leaf images, visualize spectral profiles, and receive real-time classification results. This system supports rapid and accessible plant disease diagnostics, contributing to precision agriculture practices. Future work will expand the training dataset to encompass diverse genotypes, field conditions, and disease stages, and will extend the system for multiclass disease classification and broader crop applicability.
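Genetic-algorithm band selection of the kind described above can be sketched as follows. This is a generic illustration, not the authors' implementation: the abstract does not state the fitness function, population size, or genetic operators, so `toy_fitness`, `INFORMATIVE`, and every parameter value below are hypothetical stand-ins. In practice the fitness would presumably be some measure of how well the chosen bands separate healthy from infected leaves.

```python
import random


def select_bands(n_bands, k, fitness, generations=50, pop_size=30,
                 mutation_rate=0.2, seed=0):
    """Search for k band indices (out of n_bands) that maximise `fitness`."""
    rng = random.Random(seed)
    population = [tuple(sorted(rng.sample(range(n_bands), k)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw k bands from the union of both parents.
            pool = sorted(set(a) | set(b))
            child = set(rng.sample(pool, k))
            # Mutation: swap one band for a random one.
            if rng.random() < mutation_rate:
                child.discard(rng.choice(sorted(child)))
                while len(child) < k:
                    child.add(rng.randrange(n_bands))
            children.append(tuple(sorted(child)))
        population = parents + children
    return max(population, key=fitness)


# Hypothetical stand-in fitness: counts how many selected bands fall in a
# made-up "informative" set. A real fitness would score a classifier.
INFORMATIVE = {12, 40, 77, 101, 130}


def toy_fitness(bands):
    return len(INFORMATIVE.intersection(bands))


best = select_bands(n_bands=150, k=5, fitness=toy_fitness)
```

Each candidate is a sorted tuple of band indices, which keeps every individual at exactly five distinct bands through crossover and mutation.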


Iterative Misclassification Error Training (IMET): An Optimized Neural Network Training Technique for Image Classification

arXiv.org Artificial Intelligence

Deep learning models have proven effective on medical datasets for accurate diagnostic predictions from images. However, medical datasets often contain noisy, mislabeled, or poorly generalizable images, particularly for edge cases and anomalous outcomes. Additionally, high-quality datasets are often small in sample size, which can result in overfitting, where models memorize noise rather than learn generalizable patterns. This could pose serious risks in medical diagnostics, where misclassification can impact human life. Several data-efficient training strategies have emerged to address these constraints. In particular, coreset selection identifies compact subsets of the most representative samples, enabling training that approximates full-dataset performance while reducing computational overhead. On the other hand, curriculum learning relies on gradually increasing training difficulty and accelerating convergence. However, developing a generalizable difficulty ranking mechanism that works across diverse domains, datasets, and models while reducing the computational burden remains challenging. In this paper, we introduce Iterative Misclassification Error Training (IMET), a novel framework inspired by curriculum learning and coreset selection. The IMET approach aims to identify misclassified samples in order to streamline the training process, while prioritizing the model's attention to edge-case scenarios and rare outcomes. The paper evaluates IMET's performance on benchmark medical image classification datasets against state-of-the-art ResNet architectures. It also presents results demonstrating IMET's potential for enhancing model robustness and accuracy in medical image analysis.


An Explainable Transformer Model for Alzheimer's Disease Detection Using Retinal Imaging

arXiv.org Artificial Intelligence

Alzheimer's disease (AD) is a neurodegenerative disorder that affects millions worldwide. In the absence of effective treatment options, early diagnosis is crucial for initiating management strategies to delay disease onset and slow down its progression. In this study, we propose Retformer, a novel transformer-based architecture for detecting AD using retinal imaging modalities, leveraging the power of transformers and explainable artificial intelligence. The Retformer model is trained on datasets of different modalities of retinal images from patients with AD and age-matched healthy controls, enabling it to learn complex patterns and relationships between image features and disease diagnosis. To provide insights into the decision-making process of our model, we employ the Gradient-weighted Class Activation Mapping algorithm to visualize the feature importance maps, highlighting the regions of the retinal images that contribute most significantly to the classification outcome. These findings are compared to existing clinical studies on detecting AD using retinal biomarkers, allowing us to identify the most important features for AD detection in each imaging modality. The Retformer model outperforms a variety of benchmark algorithms across different performance metrics by margins of up to 11%.


Fast Re-Trainable Attention Autoencoder for Liquid Sensor Anomaly Detection at the Edge

arXiv.org Artificial Intelligence

Modern life-science and chemistry laboratories handle highly reactive liquids such as strong acids and bases, organic solvents, and powerful oxidisers. Small deviations in temperature, concentration, stirring speed, or dissolved-oxygen level can trigger unpredictable behaviour that releases toxic gases, generates intense heat, or causes explosions. These events place personnel, facilities, and property at serious risk. Statistics from the U.S. Chemical Safety Board, covering 2013 to 2023, show that liquid-chemical leaks make up about thirty percent of all laboratory incidents; forty-two percent of those incidents lead to human exposure, and twelve percent require building evacuation. Each liquid has its own distribution of normal physicochemical values, so baseline sensor readings change from one experiment to another. Redesigning and relabelling a multi-class model for every new setup is impractical. Current monitoring still relies on visual checks and single-sensor alarms, which do not capture correlations among sensors. Cloud-based IoT solutions are often blocked in high-security laboratories because data must remain on site and Internet latency cannot be guaranteed. An edge-resident intelligent system that processes multimodal data in real time and issues early warnings inside the laboratory network is therefore required.


SAFERad: A Framework to Enable Radar Data for Safety-Relevant Perception Tasks

arXiv.org Artificial Intelligence

Radar sensors play a crucial role for perception systems in automated driving but suffer from a high level of noise. In the past, this could be solved by strict filters, which remove most false positives at the expense of undetected objects. Future highly automated functions are much more demanding with respect to error rate. Hence, if the radar sensor serves as a component of perception systems for such functions, a simple filter strategy cannot be applied. In this paper, we present a modified filtering approach that varies the filtering depending on the potential for a harmful collision with the object that the radar point may represent. We propose an algorithm which determines a criticality score for each point based on the planned or presumable trajectory of the automated vehicle. Points identified as very critical can trigger manifold actions to confirm or deny object presence. Our pipeline introduces criticality regions. The filter threshold in these criticality regions is omitted. Commonly known radar data sets feature few or no critical scenes. Thus, we present an approach to evaluate our framework by adapting the planned trajectory towards vulnerable road users, which serve as ground truth critical points. Evaluation of the criticality metric proves high recall rates. Besides, our post-processing algorithm lowers the rate of non-clustered critical points by 74.8% in an exemplary setup compared to a moderate, generic filter.

I. INTRODUCTION

Automated driving and parking functions are fields which currently receive considerable attention in research. Here, a major demand on automated vehicles is safe behavior in all situations of the Operational Design Domain (ODD). In SAE level 2 functions, critical edge cases can be handled by the driver, who is, in case of uncertainty, still in charge of the vehicle's actions.

Manuscript received November 18, 2024; revised December 31, 2024.