ConVerse: Benchmarking Contextual Safety in Agent-to-Agent Conversations

Gomaa, Amr; Salem, Ahmed; Abdelnabi, Sahar

Nov-10-2025, arXiv.org Artificial Intelligence

As language models evolve into autonomous agents that act and communicate on behalf of users, ensuring safety in multi-agent ecosystems becomes a central challenge. Interactions between personal assistants and external service providers expose a core tension between utility and protection: effective collaboration requires information sharing, yet every exchange creates new attack surfaces. We introduce ConVerse, a dynamic benchmark for evaluating privacy and security risks in agent-to-agent interactions. ConVerse spans three practical domains (travel, real estate, insurance) with 12 user personas and 864 contextually grounded attacks (611 privacy, 253 security). Unlike prior single-agent settings, it models autonomous, multi-turn agent-to-agent conversations where malicious requests are embedded within plausible discourse. Privacy is tested through a three-tier taxonomy assessing abstraction quality, while security attacks target tool use and preference manipulation. Evaluating seven state-of-the-art models reveals persistent vulnerabilities: privacy attacks succeed in up to 88% of cases and security breaches in up to 60%, with stronger models leaking more. By unifying privacy and security within interactive multi-agent contexts, ConVerse reframes safety as an emergent property of communication.

artificial intelligence, deep learning, machine learning

Country:
- Europe
  - Czechia > Prague (0.04)
  - Germany > Baden-Württemberg
    - Tübingen Region > Tübingen (0.04)
  - Greece (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Switzerland > Zürich
    - Zürich (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- North America
  - Canada > Ontario
    - Toronto (0.04)
  - Montserrat (0.04)

Genre:
- Research Report (1.00)

Industry:
- Banking & Finance (1.00)
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.31)
    - Representation & Reasoning > Agents (1.00)
  - Security & Privacy (1.00)