Goto

Collaborating Authors

 ssti


CyberRAG: An Agentic RAG cyber attack classification and reporting tool

Blefari, Francesco, Cosentino, Cristian, Pironti, Francesco Aurelio, Furfaro, Angelo, Marozzo, Fabrizio

arXiv.org Artificial Intelligence

Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94\% accuracy per class and a final classification accuracy of 94.92\% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.


LoRA Users Beware: A Few Spurious Tokens Can Manipulate Your Finetuned Model

Salles, Marcel Mateos, Goyal, Praney, Sekhsaria, Pradyut, Huang, Hai, Balestriero, Randall

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are commonly finetuned for a variety of use cases and domains. A common approach is to leverage Low-Rank Adaptation (LoRA) -- known to provide strong performance at low resource costs. In this study, we demonstrate that LoRA actually opens the door to short-cut vulnerabilities -- and the more resource efficient is the LoRA setup, the more vulnerable will be the finetuned model to aggressive attacks. To measure that vulnerability, we introduce Seamless Spurious Token Injection (SSTI), where we find that LoRA exclusively focuses on even just a single token that is spuriously correlated with downstream labels. In short, injection of that spurious token during finetuning ensure that the model's prediction at test-time can be manipulated on-demand. We conducted experiments across model families and datasets to evaluate the impact of SSTI during LoRA finetuning while providing possible mitigations. Our experiments conclude that none of the existing checkers and preprocessors can sanitize a dataset raising new concerns for data quality and AI safety.