Multi-Agent Systems Execute Arbitrary Malicious Code
Triedman, Harold, Jha, Rishi, Shmatikov, Vitaly
arXiv.org Artificial Intelligence
Multi-agent systems coordinate LLM-based agents to perform tasks on users' behalf. In real-world applications, multi-agent systems will inevitably interact with untrusted inputs, such as malicious Web content, files, and email attachments. Using several recently proposed multi-agent frameworks as concrete examples, we demonstrate that adversarial content can hijack control and communication within the system to invoke unsafe agents and functionalities. This results in a complete security breach, up to execution of arbitrary malicious code on the user's device and/or exfiltration of sensitive data from the user's containerized environment. We show that control-flow hijacking attacks succeed even if the individual agents are not susceptible to direct or indirect prompt injection, and even if they refuse to perform harmful actions.
Mar-15-2025
- Country:
  - Africa > Eswatini
  - Asia > Middle East > Palestine > Gaza Strip > Rafah Governorate > Rafah (0.04)
  - Europe > Poland > Kuyavian-Pomeranian Province > Bydgoszcz (0.04)
  - Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
  - Europe > Switzerland > Basel-City > Basel (0.04)
  - North America > United States > California > Santa Clara County > Palo Alto (0.04)
  - North America > United States > New York > New York County > New York City (0.04)
- Genre:
  - Research Report (0.84)
  - Workflow (0.67)
- Industry:
  - Information Technology > Security & Privacy (1.00)
- Technology: