Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents

de Witt, Christian Schroeder

arXiv.org Artificial Intelligence 

Free-form protocols are essential for AI's task generalization but enable new threats like secret collusion and coordinated swarm attacks. Network effects can rapidly spread privacy breaches, disinformation, jailbreaks, and data poisoning, while multi-agent dispersion and stealth optimization help adversaries evade oversight, creating novel persistent threats at a systemic level. Despite their critical importance, these security challenges remain understudied, with research fragmented across disparate fields including AI security, multi-agent learning, complex systems, cybersecurity, game theory, distributed systems, and technical AI governance. We introduce multi-agent security, a new field dedicated to securing networks of decentralized AI agents against threats that emerge or amplify through their interactions (whether direct or indirect via shared environments) with each other, humans, and institutions, and we characterize fundamental security-performance trade-offs. Our preliminary work (1) taxonomizes the threat landscape arising from interacting AI agents, (2) surveys security-performance trade-offs in decentralized AI systems, and (3) proposes a unified research agenda addressing open challenges in designing secure agent systems and interaction environments. By identifying these gaps, we aim to guide research in this critical area to unlock the socioeconomic potential of large-scale agent deployment on the internet, foster public trust, and mitigate national security risks in critical infrastructure and defense contexts.

Figure 1: Multi-agent threats demand multi-agent security: [Left] Two malicious AI agents (Mallory and Trudy) interact with a human user (Bob) through a shared message board in a way that appears innocuous to the overseer (magnifying glass).
