AITopics | York County

Collaborating Authors

York County

A Study on Semi-Supervised Detection of DDoS Attacks under Class Imbalance

Hallaji, Ehsan, Shanmugam, Vaishnavi, Razavi-Far, Roozbeh, Saif, Mehrdad

arXiv.org Artificial IntelligenceJul-1-2025

One of the most difficult challenges in cybersecurity is eliminating Distributed Denial of Service (DDoS) attacks. Automating this task using artificial intelligence is a complex process due to the inherent class imbalance and lack of sufficient labeled samples of real-world datasets. This research investigates the use of Semi-Supervised Learning (SSL) techniques to improve DDoS attack detection when data is imbalanced and partially labeled. In this process, 13 state-of-the-art SSL algorithms are evaluated for detecting DDoS attacks in several scenarios. We evaluate their practical efficacy and shortcomings, including the extent to which they work in extreme environments. The results will offer insight into designing intelligent Intrusion Detection Systems (IDSs) that are robust against class imbalance and handle partially labeled data.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.22949

Country:

North America > Canada > Ontario > Essex County > Windsor (0.04)
North America > Canada > New Brunswick > York County > Fredericton (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.48)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.37)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.35)

Add feedback

Heart2Mind: Human-Centered Contestable Psychiatric Disorder Diagnosis System using Wearable ECG Monitors

Nguyen, Hung, Rahimi, Alireza, Whitford, Veronica, Fournier, Hélène, Kondratova, Irina, Richard, René, Cao, Hung

arXiv.org Artificial IntelligenceMay-20-2025

Psychiatric disorders affect millions globally, yet their diagnosis faces significant challenges in clinical practice due to subjective assessments and accessibility concerns, leading to potential delays in treatment. To help address this issue, we present Heart2Mind, a human-centered contestable psychiatric disorder diagnosis system using wearable electrocardiogram (ECG) monitors. Our approach leverages cardiac biomarkers, particularly heart rate variability (HRV) and R-R intervals (RRI) time series, as objective indicators of autonomic dysfunction in psychiatric conditions. The system comprises three key components: (1) a Cardiac Monitoring Interface (CMI) for real-time data acquisition from Polar H9/H10 devices; (2) a Multi-Scale Temporal-Frequency Transformer (MSTFT) that processes RRI time series through integrated time-frequency domain analysis; (3) a Contestable Diagnosis Interface (CDI) combining Self-Adversarial Explanations (SAEs) with contestable Large Language Models (LLMs). Our MSTFT achieves 91.7% accuracy on the HRV-ACC dataset using leave-one-out cross-validation, outperforming state-of-the-art methods. SAEs successfully detect inconsistencies in model predictions by comparing attention-based and gradient-based explanations, while LLMs enable clinicians to validate correct predictions and contest erroneous ones. This work demonstrates the feasibility of combining wearable technology with Explainable Artificial Intelligence (XAI) and contestable LLMs to create a transparent, contestable system for psychiatric diagnosis that maintains clinical oversight while leveraging advanced AI capabilities. Our implementation is publicly available at: https://github.com/Analytics-Everywhere-Lab/heart2mind.

disorder, large language model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2505.11612

Country:

North America > Canada > New Brunswick > York County > Fredericton (0.14)
North America > Canada > New Brunswick > Westmorland County > Moncton (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(5 more...)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.45)
Research Report > Experimental Study (0.45)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.67)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

Automated Meta Prompt Engineering for Alignment with the Theory of Mind

Baughman, Aaron, Agarwal, Rahul, Morales, Eduardo, Akay, Gozde

arXiv.org Artificial IntelligenceMay-15-2025

We introduce a method of meta-prompting that jointly produces fluent text for complex tasks while optimizing the similarity of neural states between a human's mental expectation and a Large Language Model's (LLM) neural processing. A technique of agentic reinforcement learning is applied, in which an LLM as a Judge (LLMaaJ) teaches another LLM, through in-context learning, how to produce content by interpreting the intended and unintended generated text traits. To measure human mental beliefs around content production, users modify long form AI-generated text articles before publication at the US Open 2024 tennis Grand Slam. Now, an LLMaaJ can solve the Theory of Mind (ToM) alignment problem by anticipating and including human edits within the creation of text from an LLM. Throughout experimentation and by interpreting the results of a live production system, the expectations of human content reviewers had 100% of alignment with AI 53.8% of the time with an average iteration count of 4.38. The geometric interpretation of content traits such as factualness, novelty, repetitiveness, and relevancy over a Hilbert vector space combines spatial volume (all trait importance) with vertices alignment (individual trait relevance) enabled the LLMaaJ to optimize on Human ToM. This resulted in an increase in content quality by extending the coverage of tennis action. Our work that was deployed at the US Open 2024 has been used across other live events within sports and entertainment.

dimension, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.09024

Country:

North America > United States > North Carolina > Wake County > Cary (0.40)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre: Research Report (0.85)

Industry: Leisure & Entertainment > Sports > Tennis (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.32)

Add feedback

Large Language Models (LLMs) for Source Code Analysis: applications, models and datasets

Jelodar, Hamed, Meymani, Mohammad, Razavi-Far, Roozbeh

arXiv.org Artificial IntelligenceMar-21-2025

Large language models (LLMs) and transformer-based architectures are increasingly utilized for source code analysis. As software systems grow in complexity, integrating LLMs into code analysis workflows becomes essential for enhancing efficiency, accuracy, and automation. This paper explores the role of LLMs for different code analysis tasks, focusing on three key aspects: 1) what they can analyze and their applications, 2) what models are used and 3) what datasets are used, and the challenges they face. Regarding the goal of this research, we investigate scholarly articles that explore the use of LLMs for source code analysis to uncover research developments, current trends, and the intellectual structure of this emerging field. Additionally, we summarize limitations and highlight essential tools, datasets, and key challenges, which could be valuable for future work.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.17502

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts (0.04)
North America > Canada > New Brunswick > York County > Fredericton (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Attackers Can Do Better: Over- and Understated Factors of Model Stealing Attacks

Oliynyk, Daryna, Mayer, Rudolf, Rauber, Andreas

arXiv.org Artificial IntelligenceMar-8-2025

Machine learning models were shown to be vulnerable to model stealing attacks, which lead to intellectual property infringement. Among other methods, substitute model training is an all-encompassing attack applicable to any machine learning model whose behaviour can be approximated from input-output queries. Whereas prior works mainly focused on improving the performance of substitute models by, e.g. developing a new substitute training method, there have been only limited ablation studies on the impact the attacker's strength has on the substitute model's performance. As a result, different authors came to diverse, sometimes contradicting, conclusions. In this work, we exhaustively examine the ambivalent influence of different factors resulting from varying the attacker's capabilities and knowledge on a substitute training attack. Our findings suggest that some of the factors that have been considered important in the past are, in fact, not that influential; instead, we discover new correlations between attack conditions and success rate. In particular, we demonstrate that better-performing target models enable higher-fidelity attacks and explain the intuition behind this phenomenon. Further, we propose to shift the focus from the complexity of target models toward the complexity of their learning tasks. Therefore, for the substitute model, rather than aiming for a higher architecture complexity, we suggest focusing on getting data of higher complexity and an appropriate architecture. Finally, we demonstrate that even in the most limited data-free scenario, there is no need to overcompensate weak knowledge with millions of queries. Our results often exceed or match the performance of previous attacks that assume a stronger attacker, suggesting that these stronger attacks are likely endangering a model owner's intellectual property to a significantly higher degree than shown until now.

attacker, substitute model, target model, (15 more...)

arXiv.org Artificial Intelligence

2503.06188

Country:

Europe > Austria > Vienna (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(23 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

UNB StepUP: A footStep database for gait analysis and recognition using Underfoot Pressure

Larracy, Robyn, Phinyomark, Angkoon, Salehi, Ala, MacDonald, Eve, Kazemi, Saeed, Bashar, Shikder Shafiul, Tabor, Aaron, Scheme, Erik

arXiv.org Artificial IntelligenceFeb-26-2025

Gait refers to the patterns of limb movement generated during walking, which are unique to each individual due to both physical and behavioural traits. Walking patterns have been widely studied in biometrics, biomechanics, sports, and rehabilitation. While traditional methods rely on video and motion capture, advances in underfoot pressure sensing technology now offer deeper insights into gait. However, underfoot pressures during walking remain underexplored due to the lack of large, publicly accessible datasets. To address this, the UNB StepUP database was created, featuring gait pressure data collected with high-resolution pressure sensing tiles (4 sensors/cm$^2$, 1.2m by 3.6m). Its first release, UNB StepUP-P150, includes over 200,000 footsteps from 150 individuals across various walking speeds (preferred, slow-to-stop, fast, and slow) and footwear types (barefoot, standard shoes, and two personal shoes). As the largest and most comprehensive dataset of its kind, it supports biometric gait recognition while presenting new research opportunities in biomechanics and deep learning. The UNB StepUP-P150 dataset sets a new benchmark for pressure-based gait analysis and recognition. Please note that the hypertext links to the dataset on FigShare remain dormant while the document is under review.

dataset, footstep, participant, (15 more...)

arXiv.org Artificial Intelligence

2502.17244

Country:

North America > Canada > New Brunswick > York County > Fredericton (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)
North America > United States > Massachusetts > Middlesex County > Natick (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (0.67)
Health & Medicine > Consumer Health (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.46)
Health & Medicine > Therapeutic Area > Musculoskeletal (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

TrustChain: A Blockchain Framework for Auditing and Verifying Aggregators in Decentralized Federated Learning

Hallaji, Ehsan, Razavi-Far, Roozbeh, Saif, Mehrdad

arXiv.org Artificial IntelligenceFeb-22-2025

--The server-less nature of Decentralized Federated Learning (DFL) requires allocating the aggregation role to specific participants in each federated round. Current DFL architectures ensure the trustworthiness of the aggregator node upon selection. However, most of these studies overlook the possibility that the aggregating node may turn rogue and act maliciously after being nominated. T o address this problem, this paper proposes a DFL structure, called TrustChain, that scores the aggregators before selection based on their past behavior and additionally audits them after the aggregation. T o do this, the statistical independence between the client updates and the aggregated model is continuously monitored using the Hilbert-Schmidt Independence Criterion (HSIC). The proposed method relies on several principles, including blockchain, anomaly detection, and concept drift analysis. The designed structure is evaluated on several federated datasets and attack scenarios with different numbers of Byzantine nodes. HE advent of Federated Learning (FL) advanced the field of distributed machine learning by introducing data decentralization as a solution to bring about data privacy and communication efficiency [1]. Despite its advantages, FL was shown to be vulnerable against a spectrum of adversaries due to its distributed nature [2], [3]. Numerous research endeavors have been dedicated to studying these threats and finding robust defense mechanisms to mitigate them. A common perception among the majority of these studies is that the server is trustworthy, and malicious activities can potentially be initiated from the edge nodes.

aggregator, federated learning, node, (15 more...)

arXiv.org Artificial Intelligence

2502.16406

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Ontario > Essex County > Windsor (0.04)
North America > Canada > New Brunswick > York County > Fredericton (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

FedNIA: Noise-Induced Activation Analysis for Mitigating Data Poisoning in FL

Hallaji, Ehsan, Razavi-Far, Roozbeh, Saif, Mehrdad

arXiv.org Artificial IntelligenceFeb-22-2025

Federated learning systems are increasingly threatened by data poisoning attacks, where malicious clients compromise global models by contributing tampered updates. Existing defenses often rely on impractical assumptions, such as access to a central test dataset, or fail to generalize across diverse attack types, particularly those involving multiple malicious clients working collaboratively. To address this, we propose Federated Noise-Induced Activation Analysis (FedNIA), a novel defense framework to identify and exclude adversarial clients without relying on any central test dataset. FedNIA injects random noise inputs to analyze the layerwise activation patterns in client models leveraging an autoencoder that detects abnormal behaviors indicative of data poisoning. FedNIA can defend against diverse attack types, including sample poisoning, label flipping, and backdoors, even in scenarios with multiple attacking nodes. Experimental results on non-iid federated datasets demonstrate its effectiveness and robustness, underscoring its potential as a foundational approach for enhancing the security of federated learning systems.

federated learning, fednia, poisoning attack, (10 more...)

arXiv.org Artificial Intelligence

2502.16396

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Orange County > Anaheim (0.04)
North America > Canada > Ontario > Essex County > Windsor (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Federated Continual Learning: Concepts, Challenges, and Solutions

Hamedi, Parisa, Razavi-Far, Roozbeh, Hallaji, Ehsan

arXiv.org Artificial IntelligenceFeb-10-2025

Federated Continual Learning (FCL) has emerged as a robust solution for collaborative model training in dynamic environments, where data samples are continuously generated and distributed across multiple devices. This survey provides a comprehensive review of FCL, focusing on key challenges such as heterogeneity, model stability, communication overhead, and privacy preservation. We explore various forms of heterogeneity and their impact on model performance. Solutions to non-IID data, resource-constrained platforms, and personalized learning are reviewed in an effort to show the complexities of handling heterogeneous data distributions. Next, we review techniques for ensuring model stability and avoiding catastrophic forgetting, which are critical in non-stationary environments. Privacy-preserving techniques are another aspect of FCL that have been reviewed in this work. This survey has integrated insights from federated learning and continual learning to present strategies for improving the efficacy and scalability of FCL systems, making it applicable to a wide range of real-world scenarios.

data mining, knowledge management, machine learning, (23 more...)

arXiv.org Artificial Intelligence

2502.07059

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)
(11 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education > Educational Technology > Educational Software (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Knowledge Management (1.00)
(11 more...)

Add feedback

A Study on the Importance of Features in Detecting Advanced Persistent Threats Using Machine Learning

Hallaji, Ehsan, Razavi-Far, Roozbeh, Saif, Mehrdad

arXiv.org Artificial IntelligenceFeb-10-2025

Advanced Persistent Threats (APTs) pose a significant security risk to organizations and industries. These attacks often lead to severe data breaches and compromise the system for a long time. Mitigating these sophisticated attacks is highly challenging due to the stealthy and persistent nature of APTs. Machine learning models are often employed to tackle this challenge by bringing automation and scalability to APT detection. Nevertheless, these intelligent methods are data-driven, and thus, highly affected by the quality and relevance of input data. This paper aims to analyze measurements considered when recording network traffic and conclude which features contribute more to detecting APT samples. To do this, we study the features associated with various APT cases and determine their importance using a machine learning framework. To ensure the generalization of our findings, several feature selection techniques are employed and paired with different classifiers to evaluate their effectiveness. Our findings provide insights into how APT detection can be enhanced in real-world scenarios.

artificial intelligence, dll, machine learning, (11 more...)

arXiv.org Artificial Intelligence

2502.07207

Country:

Asia > China (0.04)
North America > United States (0.04)
North America > Canada > Ontario > Essex County > Windsor (0.04)
(9 more...)

Genre: Research Report > New Finding (0.69)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
Government > Military (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback