 Fredericton


Long-Sequence Memory with Temporal Kernels and Dense Hopfield Functionals

arXiv.org Artificial Intelligence

In this study we introduce a novel energy functional for long-sequence memory, building upon the framework of dense Hopfield networks, which achieve exponential storage capacity through higher-order interactions. Extending earlier work on long-sequence Hopfield memory models, we propose a temporal kernel $K(m, k)$ to incorporate temporal dependencies, enabling efficient sequential retrieval of patterns over extended sequences. We demonstrate the successful application of this technique to the storage and sequential retrieval of movie frames, which are well suited to this task: each frame is a high-dimensional vector, so even consecutive frames are sufficiently separated in the high-dimensional space. The technique has applications in modern transformer architectures, including efficient long-sequence modeling, memory augmentation, improved attention with temporal bias, and enhanced handling of long-term dependencies in time-series data. Our model offers a promising approach to address the limitations of transformers in long-context tasks, with potential implications for natural language processing, forecasting, and beyond.
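The abstract does not give the form of $K(m, k)$ or of the energy functional, so the following is only a minimal sketch of sequential retrieval in a dense Hopfield-style memory. It assumes a hypothetical "shift" kernel that links each stored pattern to its successor and a cubic interaction term; the names `make_patterns`, `shift_kernel`, and `step` are illustrative and not taken from the paper.

```python
import random

def make_patterns(num_patterns, dim, seed=0):
    """Random +/-1 patterns standing in for flattened movie frames."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(num_patterns)]

def shift_kernel(m, k):
    """Hypothetical temporal kernel K(m, k): link pattern m only to its successor."""
    return 1.0 if k == m + 1 else 0.0

def step(state, patterns, kernel=shift_kernel, degree=3):
    """One update: score the state against every stored pattern, then move
    toward the patterns that the kernel associates with the best matches."""
    dim = len(state)
    # Higher-order (polynomial) similarity sharpens the match to the current pattern.
    scores = [
        (sum(p * s for p, s in zip(pattern, state)) / dim) ** degree
        for pattern in patterns
    ]
    field = [0.0] * dim
    for m, score in enumerate(scores):
        for k, target in enumerate(patterns):
            weight = kernel(m, k) * score
            if weight != 0.0:
                for i in range(dim):
                    field[i] += weight * target[i]
    return [1 if h >= 0 else -1 for h in field]

# Start at the first stored pattern and replay the sequence step by step.
patterns = make_patterns(num_patterns=8, dim=100)
state = patterns[0]
recalled = [state]
for _ in range(len(patterns) - 1):
    state = step(state, patterns)
    recalled.append(state)
```

With an odd interaction degree, the overlap with the current pattern is 1 while overlaps with unrelated random patterns shrink rapidly when raised to the power `degree`, which is why the successor pattern dominates the update in this toy setting.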


A Study on Semi-Supervised Detection of DDoS Attacks under Class Imbalance

arXiv.org Artificial Intelligence

One of the most difficult challenges in cybersecurity is eliminating Distributed Denial of Service (DDoS) attacks. Automating this task using artificial intelligence is a complex process due to the inherent class imbalance and the lack of sufficient labeled samples in real-world datasets. This research investigates the use of Semi-Supervised Learning (SSL) techniques to improve DDoS attack detection when data is imbalanced and partially labeled. In this process, 13 state-of-the-art SSL algorithms are evaluated for detecting DDoS attacks in several scenarios. We evaluate their practical efficacy and shortcomings, including the extent to which they work in extreme environments. The results will offer insight into designing intelligent Intrusion Detection Systems (IDSs) that are robust against class imbalance and handle partially labeled data.


A Framework for Non-Linear Attention via Modern Hopfield Networks

arXiv.org Machine Learning

In this work we propose an energy functional along the lines of Modern Hopfield Networks (MHN), the stationary points of which correspond to the attention due to Vaswani et al. [12], thus unifying both frameworks. The minima of this landscape form "context wells": stable configurations that encapsulate the contextual relationships among tokens. A compelling picture emerges: across $n$ token embeddings an energy landscape is defined whose gradient corresponds to the attention computation. Non-linear attention mechanisms offer a means to enhance the capabilities of transformer models for various sequence modeling tasks by improving the model's understanding of complex relationships, learning of representations, and overall efficiency and performance. A rough analogy can be seen via cubic splines, which offer a richer representation of non-linear data where a simpler linear model may be inadequate. This approach can be used for the introduction of non-linear heads in transformer-based models such as BERT, [6], etc.
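The link between an energy gradient and attention can be illustrated with the standard log-sum-exp energy from the Modern Hopfield literature, $E(x) = \tfrac{1}{2}\lVert x\rVert^2 - \beta^{-1}\log\sum_i \exp(\beta\, k_i \cdot x)$. This is an assumption made for illustration and not necessarily the functional proposed in the paper. Its gradient is $x$ minus the softmax-weighted average of the stored vectors, so stationary points satisfy $x = \mathrm{attention}(x)$. The sketch below uses the same vectors as keys and values, and checks the analytic gradient numerically.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def energy(x, keys, beta=1.0):
    """Log-sum-exp energy: 0.5*|x|^2 - (1/beta) * log(sum_i exp(beta * k_i . x))."""
    scores = [beta * dot(k, x) for k in keys]
    top = max(scores)
    log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
    return 0.5 * dot(x, x) - log_sum_exp / beta

def attention(x, keys, beta=1.0):
    """Softmax attention of query x over the stored vectors (keys double as values)."""
    weights = softmax([beta * dot(k, x) for k in keys])
    return [sum(w * k[j] for w, k in zip(weights, keys)) for j in range(len(x))]

def grad_energy(x, keys, beta=1.0):
    """Analytic gradient of the energy: x - attention(x)."""
    return [xj - aj for xj, aj in zip(x, attention(x, keys, beta))]

def numeric_grad(x, keys, beta=1.0, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grad = []
    for j in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[j] += eps
        minus[j] -= eps
        grad.append((energy(plus, keys, beta) - energy(minus, keys, beta)) / (2 * eps))
    return grad

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.3, -0.2]
analytic = grad_energy(query, keys, beta=2.0)
numeric = numeric_grad(query, keys, beta=2.0)
```

A gradient-descent step with unit step size, `x - grad_energy(x, keys)`, is exactly `attention(x, keys)`, which is the sense in which one attention pass can be read as one descent step on this particular landscape.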


Heart2Mind: Human-Centered Contestable Psychiatric Disorder Diagnosis System using Wearable ECG Monitors

arXiv.org Artificial Intelligence

Psychiatric disorders affect millions globally, yet their diagnosis faces significant challenges in clinical practice due to subjective assessments and accessibility concerns, leading to potential delays in treatment. To help address this issue, we present Heart2Mind, a human-centered contestable psychiatric disorder diagnosis system using wearable electrocardiogram (ECG) monitors. Our approach leverages cardiac biomarkers, particularly heart rate variability (HRV) and R-R intervals (RRI) time series, as objective indicators of autonomic dysfunction in psychiatric conditions. The system comprises three key components: (1) a Cardiac Monitoring Interface (CMI) for real-time data acquisition from Polar H9/H10 devices; (2) a Multi-Scale Temporal-Frequency Transformer (MSTFT) that processes RRI time series through integrated time-frequency domain analysis; (3) a Contestable Diagnosis Interface (CDI) combining Self-Adversarial Explanations (SAEs) with contestable Large Language Models (LLMs). Our MSTFT achieves 91.7% accuracy on the HRV-ACC dataset using leave-one-out cross-validation, outperforming state-of-the-art methods. SAEs successfully detect inconsistencies in model predictions by comparing attention-based and gradient-based explanations, while LLMs enable clinicians to validate correct predictions and contest erroneous ones. This work demonstrates the feasibility of combining wearable technology with Explainable Artificial Intelligence (XAI) and contestable LLMs to create a transparent, contestable system for psychiatric diagnosis that maintains clinical oversight while leveraging advanced AI capabilities. Our implementation is publicly available at: https://github.com/Analytics-Everywhere-Lab/heart2mind.


Adaptive Security Policy Management in Cloud Environments Using Reinforcement Learning

arXiv.org Artificial Intelligence

The security of cloud environments, such as Amazon Web Services (AWS), is complex and dynamic. Static security policies have become inadequate as threats evolve and cloud resources exhibit elasticity [1]. This paper addresses the limitations of static policies by proposing a security policy management framework that uses reinforcement learning (RL) to adapt dynamically. Specifically, we employ deep reinforcement learning algorithms, including deep Q-networks and proximal policy optimization, enabling the learning and continuous adjustment of controls such as firewall rules and Identity and Access Management (IAM) policies. The proposed RL-based solution leverages cloud telemetry data (AWS CloudTrail logs, network traffic data, threat intelligence feeds) to continuously refine security policies, maximizing threat mitigation and compliance while minimizing resource impact. Experimental results demonstrate that our adaptive RL-based framework significantly outperforms static policies, achieving higher intrusion detection rates (92% compared to 82% for static policies) and substantially reducing incident detection and response times by 58%. In addition, it maintains high conformity with security requirements and efficient resource usage.
I. INTRODUCTION
Cloud security is a critical concern as more organizations rely on cloud infrastructure. AWS and other cloud platforms provide security configurations such as firewall rules and IAM policies, which are typically managed through static policies set by administrators. However, static policies cannot adapt to the dynamic nature of cloud environments, where workloads, users, and attack patterns change rapidly [1]. This rigidity exposes cloud deployments to new threats or misconfigurations that are not covered by static rules. For instance, static firewall rules may fail to detect novel attack patterns, and fixed IAM roles may become over-privileged as resources scale, increasing risk.
Problem Statement: Traditional cloud security policy management cannot keep pace with evolving threats and agile DevOps practices. Manual policy updates are error-prone and slow.


Almost Bayesian: The Fractal Dynamics of Stochastic Gradient Descent

arXiv.org Artificial Intelligence

We show that the behavior of stochastic gradient descent (SGD) is related to Bayesian statistics by showing that SGD is effectively diffusion on a fractal landscape, where the fractal dimension can be accounted for in a purely Bayesian way. By doing this we show that SGD can be regarded as a modified Bayesian sampler which accounts for accessibility constraints induced by the fractal structure of the loss landscape. We verify our results experimentally by examining the diffusion of weights during training. These results offer insight into the factors which determine the learning process, and seemingly answer the question of how SGD and purely Bayesian sampling are related.
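One generic way to examine the diffusion of weights is to track the mean squared displacement (MSD) of the weight vector from its starting point and fit the exponent $\alpha$ in $\mathrm{MSD}(t) \propto t^{\alpha}$ on a log-log scale. The sketch below is a common diagnostic offered under that assumption; it is not the experimental protocol of the paper, and the function names are illustrative.

```python
import math
import random

def mean_squared_displacement(trajectory):
    """Squared distance of every checkpoint from the initial weight vector."""
    start = trajectory[0]
    return [sum((w - w0) ** 2 for w, w0 in zip(point, start)) for point in trajectory]

def diffusion_exponent(trajectory):
    """Least-squares slope of log(MSD) against log(step).

    Step 0 and any checkpoint with zero displacement are skipped because
    their logarithm is undefined.
    """
    msd = mean_squared_displacement(trajectory)
    points = [(math.log(t), math.log(msd[t])) for t in range(1, len(msd)) if msd[t] > 0]
    if len(points) < 2:
        raise ValueError("need at least two checkpoints with non-zero displacement")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    return covariance / variance

# Stand-in for recorded training checkpoints: a plain Gaussian random walk.
rng = random.Random(0)
walk = [[0.0, 0.0]]
for _ in range(2000):
    walk.append([w + rng.gauss(0.0, 0.01) for w in walk[-1]])
walk_exponent = diffusion_exponent(walk)

# Straight-line (ballistic) motion for comparison: MSD grows as t^2.
ballistic = [[0.1 * t, -0.2 * t] for t in range(50)]
ballistic_exponent = diffusion_exponent(ballistic)
```

Ordinary diffusion corresponds to an exponent near 1 when averaged over many runs, ballistic motion to 2, and values below 1 to sub-diffusion; a single trajectory gives only a noisy estimate.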


Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets

arXiv.org Artificial Intelligence

Large language models (LLMs) and transformer-based architectures are increasingly utilized for source code analysis. As software systems grow in complexity, integrating LLMs into code analysis workflows becomes essential for enhancing efficiency, accuracy, and automation. This paper explores the role of LLMs for different code analysis tasks, focusing on three key aspects: 1) what they can analyze and their applications, 2) what models are used, and 3) what datasets are used and the challenges they face. To this end, we investigate scholarly articles that explore the use of LLMs for source code analysis to uncover research developments, current trends, and the intellectual structure of this emerging field. Additionally, we summarize limitations and highlight essential tools, datasets, and key challenges, which could be valuable for future work.


Pruning-Based TinyML Optimization of Machine Learning Models for Anomaly Detection in Electric Vehicle Charging Infrastructure

arXiv.org Artificial Intelligence

With the growing need for real-time processing on IoT devices, optimizing machine learning (ML) models' size, latency, and computational efficiency is essential. This paper investigates a pruning method for anomaly detection in resource-constrained environments, specifically targeting Electric Vehicle Charging Infrastructure (EVCI). Using the CICEVSE2024 dataset, we trained and optimized three models, namely Multi-Layer Perceptron (MLP), Long Short-Term Memory (LSTM), and XGBoost, through hyperparameter tuning with Optuna, further refining them using SHapley Additive exPlanations (SHAP)-based feature selection (FS) and unstructured pruning techniques. The optimized models achieved significant reductions in model size and inference times, with only a marginal impact on their performance. Notably, our findings indicate that, in the context of EVCI, pruning and FS can enhance computational efficiency while retaining critical anomaly detection capabilities.
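Unstructured pruning zeroes individual weights rather than whole neurons or channels. The abstract does not say which criterion was used, so the sketch below assumes the common magnitude-based rule (drop the smallest-magnitude weights); `magnitude_prune` is a hypothetical helper written for illustration, not code from the paper.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of a 2-D weight matrix with the smallest-magnitude
    entries set to zero.

    sparsity is the fraction of entries to remove (0.0 keeps everything).
    The magnitude criterion is an assumption made for this sketch.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    # Rank every entry by absolute value while remembering its position.
    ranked = sorted(
        (abs(w), r, c) for r, row in enumerate(weights) for c, w in enumerate(row)
    )
    n_prune = int(len(ranked) * sparsity)
    pruned = [list(row) for row in weights]
    for _, r, c in ranked[:n_prune]:
        pruned[r][c] = 0.0
    return pruned

layer = [
    [0.5, -0.1, 0.03],
    [-0.8, 0.2, -0.05],
]
half_pruned = magnitude_prune(layer, sparsity=0.5)
```

The zeroed matrix only becomes smaller or faster in practice when it is stored in a sparse format or run on a runtime that skips zeros, which is one reason measured size and latency should be reported alongside the sparsity level.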


A Kolmogorov-Arnold Network for Explainable Detection of Cyberattacks on EV Chargers

arXiv.org Artificial Intelligence

The increasing adoption of Electric Vehicles (EVs), the expansion of charging infrastructure, and their reliance on communication expose Electric Vehicle Supply Equipment (EVSE) to cyberattacks. This paper presents a novel Kolmogorov-Arnold Network (KAN)-based framework for detecting cyberattacks on EV chargers using only power consumption measurements. Leveraging the KAN's capability to model nonlinear, high-dimensional functions and its inherently interpretable architecture, the framework effectively differentiates between normal and malicious charging scenarios. The model is trained offline on a comprehensive dataset containing over 100,000 cyberattack cases generated through an experimental setup. Once trained, the KAN model can be deployed within individual chargers for real-time detection of abnormal charging behaviors indicative of cyberattacks. Our results demonstrate that the proposed KAN-based approach can accurately detect cyberattacks on EV chargers with Precision and F1-score of 99% and 92%, respectively, outperforming existing detection methods. Additionally, the proposed KAN enables the extraction of mathematical formulas representing its detection decisions, addressing interpretability, a key challenge in deep learning-based cybersecurity frameworks. This work marks a significant step toward building secure and explainable EV charging infrastructure.
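The reported precision and F1-score jointly determine a recall value when F1 is the usual harmonic mean of precision and recall for a single class, $F_1 = 2PR/(P+R)$, which rearranges to $R = F_1 P / (2P - F_1)$. The abstract does not state how its F1-score was averaged, so the figure computed below is an inference under that assumption and not a result reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall: F1 cannot reach 2 * precision")
    return f1 * precision / denominator

# Values quoted in the abstract: precision 99%, F1-score 92%.
implied_recall = recall_from_precision_f1(precision=0.99, f1=0.92)
print(f"implied recall: {implied_recall:.3f}")
```

Under this reading the implied recall is roughly 86%, meaning that most of the gap between the precision and the F1-score would come from missed attacks rather than false alarms.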


UNB StepUP: A footStep database for gait analysis and recognition using Underfoot Pressure

arXiv.org Artificial Intelligence

Gait refers to the patterns of limb movement generated during walking, which are unique to each individual due to both physical and behavioural traits. Walking patterns have been widely studied in biometrics, biomechanics, sports, and rehabilitation. While traditional methods rely on video and motion capture, advances in underfoot pressure sensing technology now offer deeper insights into gait. However, underfoot pressures during walking remain underexplored due to the lack of large, publicly accessible datasets. To address this, the UNB StepUP database was created, featuring gait pressure data collected with high-resolution pressure sensing tiles (4 sensors/cm$^2$, 1.2m by 3.6m). Its first release, UNB StepUP-P150, includes over 200,000 footsteps from 150 individuals across various walking speeds (preferred, slow-to-stop, fast, and slow) and footwear types (barefoot, standard shoes, and two personal shoes). As the largest and most comprehensive dataset of its kind, it supports biometric gait recognition while presenting new research opportunities in biomechanics and deep learning. The UNB StepUP-P150 dataset sets a new benchmark for pressure-based gait analysis and recognition. Please note that the hypertext links to the dataset on FigShare remain dormant while the document is under review.