Design Patterns for Securing LLM Agents against Prompt Injections

Beurer-Kellner, Luca, Buesser, Beat, Creţu, Ana-Maria, Debenedetti, Edoardo, Dobos, Daniel, Fabian, Daniel, Fischer, Marc, Froelicher, David, Grosse, Kathrin, Naeff, Daniel, Ozoani, Ezinwanne, Paverd, Andrew, Tramèr, Florian, Volhejn, Václav

Jun-30-2025–arXiv.org Artificial Intelligence

As AI agents powered by Large Language Models (LLMs) become increasingly versatile and capable of addressing a broad spectrum of tasks, ensuring their security has become a critical challenge. Among the most pressing threats are prompt injection attacks, which exploit the agent's reliance on natural language inputs -- an especially dangerous threat when agents are granted tool access or handle sensitive information. In this work, we propose a set of principled design patterns for building AI agents with provable resistance to prompt injection. We systematically analyze these patterns, discuss their trade-offs in terms of utility and security, and illustrate their real-world applicability through a series of case studies.

large language model, machine learning, natural language

Genre:
- Research Report (0.41)

Industry:
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
