AI Agentic Vulnerability Injection And Transformation with Optimized Reasoning

Lbath, Amine; Amini, Massih-Reza; Delaitre, Aurelien; Okun, Vadim

Nov-10-2025–arXiv.org Artificial Intelligence

Abstract--The increasing complexity of software systems and the sophistication of cyber-attacks have underscored the critical need for effective automated vulnerability detection and repair systems. Data-driven approaches using deep learning models show promise but critically depend on the availability of large, accurately labeled datasets. Yet existing datasets suffer from noisy labels, cover a limited range of vulnerabilities, or fail to reflect vulnerabilities as they occur in real-world software. This also limits large-scale benchmarking of such solutions. Automated vulnerability injection provides a way to directly address these dataset limitations, but existing techniques remain limited in coverage, contextual fidelity, or injection success rates. In this paper, we present AVIATOR, the first AI-agentic vulnerability injection workflow. It automatically injects realistic, category-specific vulnerabilities for high-fidelity, diverse, large-scale vulnerability dataset generation. Unlike prior monolithic approaches, AVIATOR orchestrates specialized AI agents, function agents, and traditional code analysis tools that replicate expert reasoning. It combines semantic analysis, injection synthesis enhanced with LoRA-based fine-tuning and Retrieval-Augmented Generation, and post-injection validation via static analysis and LLM-based discriminators. This modular decomposition allows specialized agents to focus on distinct tasks, improving the robustness of injection and reducing error propagation across the workflow. Evaluations across three distinct benchmarks demonstrate that AVIATOR achieves 91%-95% injection success rates, significantly surpassing existing automated dataset generation techniques in both accuracy and scope of software vulnerabilities.

The rapid growth in software complexity, coupled with the sophistication of cyber-attacks, poses a significant threat to the global security and stability of digital infrastructures. In 2024 alone, the total number of publicly reported vulnerabilities rose by 25% [1]. Software vulnerabilities are weaknesses in system security requirements, design, implementation, or operation that could be accidentally triggered or intentionally exploited, resulting in a violation of the system's security policy [2].
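The abstract describes a modular workflow in which separate stages handle analysis, injection, and validation, so that a failure in one stage stops a bad sample from propagating. The following is a minimal sketch of that general idea only, not the authors' implementation: the `Candidate` type, the three toy stages, the `run_pipeline` helper, and the string-matching heuristics are all hypothetical stand-ins for the LLM agents and static analysis tools the paper refers to, and the CWE label is purely illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Candidate:
    """A code sample moving through the pipeline, with a log of completed stages."""
    code: str
    category: str  # illustrative label only, e.g. "CWE-787"
    log: List[str] = field(default_factory=list)


# A stage returns the (possibly updated) candidate, or None to reject it.
Stage = Callable[[Candidate], Optional[Candidate]]

GUARD = "if (len < sizeof(buf)) "  # hypothetical guard pattern for this toy example


def locate_guard(c: Candidate) -> Optional[Candidate]:
    """Toy stand-in for semantic analysis: require a removable bounds check."""
    if GUARD not in c.code:
        return None
    c.log.append("locate_guard")
    return c


def inject(c: Candidate) -> Optional[Candidate]:
    """Toy stand-in for injection synthesis: drop the bounds check."""
    c.code = c.code.replace(GUARD, "")
    c.log.append("inject")
    return c


def validate(c: Candidate) -> Optional[Candidate]:
    """Toy stand-in for post-injection validation: the copy must now be unguarded."""
    if "memcpy(" not in c.code or "sizeof(buf)" in c.code:
        return None
    c.log.append("validate")
    return c


STAGES: List[Stage] = [locate_guard, inject, validate]


def run_pipeline(c: Candidate, stages: List[Stage]) -> Optional[Candidate]:
    """Run stages in order; stop at the first rejection so errors do not propagate."""
    current: Optional[Candidate] = c
    for stage in stages:
        current = stage(current)
        if current is None:
            return None
    return current


if __name__ == "__main__":
    safe = ("void f(char *src, size_t len) { char buf[16]; "
            "if (len < sizeof(buf)) memcpy(buf, src, len); }")
    result = run_pipeline(Candidate(safe, "CWE-787"), STAGES)
    print(result.code if result else "rejected")
```

Keeping each stage behind the same small interface means a real system could swap a toy function for an agent or an external analyzer without changing the driver loop, which is one plausible reading of the modular decomposition the abstract describes.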

machine learning, natural language, vulnerability


Country:
- North America > United States (0.46)
- Europe > France (0.46)

Genre:
- Research Report > New Finding (0.93)
- Workflow (0.89)

Industry:
- Information Technology > Security & Privacy (1.00)
- Government > Military
  - Cyberwarfare (0.54)

Technology:
- Information Technology
  - Security & Privacy (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
