RedTeamLLM: an Agentic AI framework for offensive security
Challita, Brian, Parrend, Pierre
arXiv.org Artificial Intelligence
From automated intrusion testing to the discovery of zero-day attacks before software launch, agentic AI holds great promise for security engineering. This capability comes with a matching threat: the security and research community must build up its models before malicious actors leverage the approach for cybercrime. We therefore propose and evaluate RedTeamLLM, an integrated architecture with a comprehensive security model for automating penetration-testing tasks. RedTeamLLM follows three key steps (summarizing, reasoning, and acting), which together provide its operational capacity. This novel framework addresses four open challenges: plan correction, memory management, context window constraints, and generality vs. specialization. Evaluation is performed through the automated resolution of a range of entry-level, but not trivial, CTF challenges. The contribution of the reasoning capability of our agentic AI framework is evaluated specifically.
May-13-2025
- Country:
- Europe (0.28)
- Genre:
- Research Report (0.64)
- Workflow (0.46)
- Industry:
- Government > Military
- Cyberwarfare (0.69)
- Information Technology > Security & Privacy (1.00)
- Technology: