Agents
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Free-form protocols are essential for AI's task generalization but enable new threats like secret collusion and coordinated swarm attacks. Network effects can rapidly spread privacy breaches, disinformation, jailbreaks, and data poisoning, while multi-agent dispersion and stealth optimization help adversaries evade oversight--creating novel persistent threats at a systemic level. Despite their critical importance, these security challenges remain understudied, with research fragmented across disparate fields including AI security, multi-agent learning, complex systems, cybersecurity, game theory, distributed systems, and technical AI governance. We introduce multi-agent security, a new field dedicated to securing networks of decentralized AI agents against threats that emerge or amplify through their interactions--whether direct or indirect via shared environments--with each other, humans, and institutions, and characterise fundamental security-performance trade-offs. Our preliminary work (1) taxonomizes the threat landscape arising from interacting AI agents, (2) surveys security-performance tradeoffs in decentralized AI systems, and (3) proposes a unified research agenda addressing open challenges in designing secure agent systems and interaction environments. By identifying these gaps, we aim to guide research in this critical area to unlock the socioeconomic potential of large-scale agent deployment on the internet, foster public trust, and mitigate national security risks in critical infrastructure and defense contexts.Figure 1: Multi-agent threats demand multi-agent security: [Left] Two malicious AI agents (Mallory and Trudy) are interacting with a human user (Bob) through a shared message board seemingly innocuously to the overseer (magnifying glass).
Training Environment for High Performance Reinforcement Learning
This paper presents Tunnel, a simple, open source, reinforcement learning training environment for high performance aircraft. It integrates the F16 3D nonlinear flight dynamics into OpenAI Gymnasium python package. The template includes primitives for boundaries, targets, adversaries and sensing capabilities that may vary depending on operational need. This offers mission planners a means to rapidly respond to evolving environments, sensor capabilities and adversaries for autonomous air combat aircraft. It offers researchers access to operationally relevant aircraft physics. Tunnel code base is accessible to anyone familiar with Gymnasium and/or those with basic python skills. This paper includes a demonstration of a week long trade study that investigated a variety of training methods, observation spaces, and threat presentations. This enables increased collaboration between researchers and mission planners which can translate to a national military advantage. As warfare becomes increasingly reliant upon automation, software agility will correlate with decision advantages. Airmen must have tools to adapt to adversaries in this context. It may take months for researchers to develop skills to customize observation, actions, tasks and training methodologies in air combat simulators. In Tunnel, this can be done in a matter of days.
Act Natural! Extending Naturalistic Projection to Multimodal Behavior Scenarios
Khan, Hamzah I., Fridovich-Keil, David
--Autonomous agents operating in public spaces must consider how their behaviors might affect the humans around them, even when not directly interacting with them. T o this end, it is often beneficial to be predictable and appear naturalistic. Existing methods for this purpose use human actor intent modeling or imitation learning techniques, but these approaches rarely capture all possible motivations for human behavior and/or require significant amounts of data. Our work extends a technique for modeling unimodal naturalistic behaviors with an explicit convex set representation, to account for multimodal behavior by using multiple convex sets. This more flexible representation provides a higher degree of fidelity in data-driven modeling of naturalistic behavior that arises in real-world scenarios in which human behavior is, in some sense, discrete, e.g. Equipped with this new set representation, we develop an optimization-based filter to project arbitrary trajectories into the set so that they appear naturalistic to humans in the scene, while also satisfying vehicle dynamics, actuator limits, etc. We demonstrate our methods on real-world human driving data from the inD (intersection) and rounD (roundabout) datasets. Safe and comfortable interaction between humans and autonomous agents requires a measure of predictable and naturalistic behavior from autonomous systems. Autonomous agents in the real world can easily find themselves in unsafe situations when they violate these informal norms of humanlike behavior. As one example, autonomous vehicles are well-documented to behave more cautiously than human drivers expect, which can lead to human drivers reacting unsafely to unexpected or abnormal driving and causing collisions [1]. Thus, autonomous agents must be able to plan and execute naturalistic, human-like behavior. However, naturalistic behavior tends to be challenging to model mathematically because human preferences and decision-making are opaque. Nevertheless, there exists a need for techniques which are able to model the wide variety of naturalistic behavior based on observations of human actions. This work was supported by the National Science Foundation under Grants 2211548 and 2336840. Each nonconvex set is represented by the union of a set of convex hulls which are formed by clustering the trajectory states at each time. Then, we project arbitrary trajectories into this set to make the behaviors more naturalistic.
Pathfinders in the Sky: Formal Decision-Making Models for Collaborative Air Traffic Control in Convective Weather
Choi, Jimin, Anand, Kartikeya, Idris, Husni R., Tran, Huy T., Li, Max Z.
Air traffic can be significantly disrupted by weather. Pathfinder operations involve assigning a designated aircraft to assess whether airspace that was previously impacted by weather can be safely traversed through. Despite relatively routine use in air traffic control, there is little research on the underlying multi-agent decision-making problem. We seek to address this gap herein by formulating decision models to capture the operational dynamics and implications of pathfinders. Specifically, we construct a Markov chain to represent the stochastic transitions between key operational states (e.g., pathfinder selection). We then analyze its steady-state behavior to understand long-term system dynamics. We also propose models to characterize flight-specific acceptance behaviors (based on utility trade-offs) and pathfinder selection strategies (based on sequential offer allocations). We then conduct a worst-case scenario analysis that highlights risks from collective rejection and explores how selfless behavior and uncertainty affect system resilience. Empirical analysis of data from the US Federal Aviation Administration demonstrates the real-world significance of pathfinder operations and informs future model calibration.
TutorGym: A Testbed for Evaluating AI Agents as Tutors and Students
Weitekamp, Daniel, Siddiqui, Momin N., MacLellan, Christopher J.
Recent improvements in large language model (LLM) performance on academic benchmarks, such as MATH and GSM8K, have emboldened their use as standalone tutors and as simulations of human learning. However, these new applications require more than evaluations of final solution generation. We introduce TutorGym to evaluate these applications more directly. TutorGym is a standard interface for testing artificial intelligence (AI) agents within existing intelligent tutoring systems (ITS) that have been tested and refined in classroom studies, including Cognitive Tutors (CTAT), Apprentice Tutors, and OATutors. TutorGym is more than a simple problem-solution benchmark, it situates AI agents within the interactive interfaces of existing ITSs. At each step of problem-solving, AI agents are asked what they would do as a tutor or as a learner. As tutors, AI agents are prompted to provide tutoring support -- such as generating examples, hints, and step-level correctness feedback -- which can be evaluated directly against the adaptive step-by-step support provided by existing ITSs. As students, agents directly learn from ITS instruction, and their mistakes and learning trajectories can be compared to student data. TutorGym establishes a common framework for training and evaluating diverse AI agents, including LLMs, computational models of learning, and reinforcement learning agents, within a growing suite of learning environments. Currently, TutorGym includes 223 different tutor domains. In an initial evaluation, we find that current LLMs are poor at tutoring -- none did better than chance at labeling incorrect actions, and next-step actions were correct only ~52-70% of the time -- but they could produce remarkably human-like learning curves when trained as students with in-context learning.
Safe and Efficient CAV Lane Changing using Decentralised Safety Shields
Hegde, Bharathkumar, Bouroche, Melanie
--Lane changing is a complex decision-making problem for Connected and Autonomous V ehicles (CA Vs) as it requires balancing traffic efficiency with safety. Although traffic efficiency can be improved by using vehicular communication for training lane change controllers using Multi-Agent Reinforcement Learning (MARL), ensuring safety is difficult. T o address this issue, we propose a decentralised Hybrid Safety Shield (HSS) that combines optimisation and a rule-based approach to guarantee safety. Our method applies control barrier functions to constrain longitudinal and lateral control inputs of a CA V to ensure safe manoeuvres. Additionally, we present an architecture to integrate HSS with MARL, called MARL-HSS, to improve traffic efficiency while ensuring safety. We evaluate MARL-HSS using a gym-like environment that simulates an on-ramp merging scenario with two levels of traffic densities, such as light and moderate densities. The results show that HSS provides a safety guarantee by strictly enforcing a dynamic safety constraint defined on a time headway, even in moderate traffic density that offers challenging lane change scenarios. Moreover, the proposed method learns stable policies compared to the baseline, a state-of-the-art MARL lane change controller without a safety shield. Further policy evaluation shows that our method achieves a balance between safety and traffic efficiency with zero crashes and comparable average speeds in light and moderate traffic densities. I NTRODUCTION Autonomous V ehicles (A Vs) were expected to be commercially available by 2020, but recent reports suggest that wider adoption of A Vs can only be expected after 2030 or beyond due to societal, regulatory, and technical challenges [1]. Complex technical problems, such as localisation, mapping, perception, route planning, and motion control, are yet to be solved to enable commercial A V deployments [2].
Modeling AI-Human Collaboration as a Multi-Agent Adaptation
Sen, Prothit, Jakkaraju, Sai Mihir
We develop an agent-based simulation to formalize AI-human collaboration as a function of task structure, advancing a generalizable framework for strategic decision-making in organizations. Distinguishing between heuristic-based human adaptation and rule-based AI search, we model interactions across modular (parallel) and sequenced (interdependent) tasks using an NK model. Our results reveal that in modular tasks, AI often substitutes for humans - delivering higher payoffs unless human expertise is very high, and the AI search space is either narrowly focused or extremely broad. In sequenced tasks, interesting complementarities emerge. When an expert human initiates the search and AI subsequently refines it, aggregate performance is maximized. Conversely, when AI leads, excessive heuristic refinement by the human can reduce payoffs. We also show that even "hallucinatory" AI - lacking memory or structure - can improve outcomes when augmenting low-capability humans by helping escape local optima. These results yield a robust implication: the effectiveness of AI-human collaboration depends less on context or industry, and more on the underlying task structure. By elevating task decomposition as the central unit of analysis, our model provides a transferable lens for strategic decision-making involving humans and an agentic AI across diverse organizational settings.
CBM-RAG: Demonstrating Enhanced Interpretability in Radiology Report Generation with Multi-Agent RAG and Concept Bottleneck Models
Alam, Hasan Md Tusfiqur, Srivastav, Devansh, Selim, Abdulrahman Mohamed, Kadir, Md Abdul, Shuvo, Md Moktadirul Hoque, Sonntag, Daniel
Advancements in generative Artificial Intelligence (AI) hold great promise for automating radiology workflows, yet challenges in interpretability and reliability hinder clinical adoption. This paper presents an automated radiology report generation framework that combines Concept Bottleneck Models (CBMs) with a Multi-Agent Retrieval-Augmented Generation (RAG) system to bridge AI performance with clinical explainability. CBMs map chest X-ray features to human-understandable clinical concepts, enabling transparent disease classification. Meanwhile, the RAG system integrates multi-agent collaboration and external knowledge to produce contextually rich, evidence-based reports. Our demonstration showcases the system's ability to deliver interpretable predictions, mitigate hallucinations, and generate high-quality, tailored reports with an interactive interface addressing accuracy, trust, and usability challenges. This framework provides a pathway to improving diagnostic consistency and empowering radiologists with actionable insights.
ResearchCodeAgent: An LLM Multi-Agent System for Automated Codification of Research Methodologies
Gandhi, Shubham, Shah, Dhruv, Patwardhan, Manasi, Vig, Lovekesh, Shroff, Gautam
In this paper we introduce ResearchCodeAgent, a novel multi-agent system leveraging large language models (LLMs) agents to automate the codification of research methodologies described in machine learning literature. The system bridges the gap between high-level research concepts and their practical implementation, allowing researchers auto-generating code of existing research papers for benchmarking or building on top-of existing methods specified in the literature with availability of partial or complete starter code. ResearchCodeAgent employs a flexible agent architecture with a comprehensive action suite, enabling context-aware interactions with the research environment. The system incorporates a dynamic planning mechanism, utilizing both short and long-term memory to adapt its approach iteratively. We evaluate ResearchCodeAgent on three distinct machine learning tasks with distinct task complexity and representing different parts of the ML pipeline: data augmentation, optimization, and data batching. Our results demonstrate the system's effectiveness and generalizability, with 46.9% of generated code being high-quality and error-free, and 25% showing performance improvements over baseline implementations. Empirical analysis shows an average reduction of 57.9% in coding time compared to manual implementation. We observe higher gains for more complex tasks. ResearchCodeAgent represents a significant step towards automating the research implementation process, potentially accelerating the pace of machine learning research.
Securing Agentic AI: A Comprehensive Threat Model and Mitigation Framework for Generative AI Agents
Narajala, Vineeth Sai, Narayan, Om
--As generative AI (GenAI) agents become more common in enterprise settings, they introduce security challenges that differ significantly from those posed by traditional systems. These agents aren't just LLMs--they reason, remember, and act, often with minimal human oversight. This paper introduces a comprehensive threat model tailored specifically for GenAI agents, focusing on how their autonomy, persistent memory access, complex reasoning, and tool integration create novel risks. This research work identifies 9 primary threats and organizes them across five key domains: cognitive architecture vulnerabilities, temporal persistence threats, operational execution vulnerabilities, trust boundary violations, and governance circumvention. These threats aren't just theoretical--they bring practical challenges such as delayed exploitability, cross-system propagation, cross system lateral movement, and subtle goal misalignments that are hard to detect with existing frameworks and standard approaches. T o help address this, the research work present two complementary frameworks: A TF AA (Advanced Threat Framework for Autonomous AI Agents), which organizes agent-specific risks, and SHIELD, a framework proposing practical mitigation strategies designed to reduce enterprise exposure. While this work builds on existing work in LLM and AI security, the focus is squarely on what makes agents different--and why those differences matter . Ultimately, this research argues that GenAI agents require a new lens for security. If we fail to adapt our threat models and defenses to account for their unique architecture and behavior, we risk turning a powerful new tool into a serious enterprise liability. Generative AI (GenAI) agents are emerging as a new category of enterprise technology.