Agents
Supplementary Materials for "Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity" A Proofs of the Main Results
We first introduce some additional notation for convenience. Our proof mainly consists of the following steps: 1. Helper lemmas and a crude bound. See A.2, and more precisely, Lemmas A.9 and A.10. 3. Final bound for $\epsilon$-approximate NE value. See A.3. 4. Final bounds for $\epsilon$-NE policy. See A.5. A.1 Important Lemmas We start with the component-wise error bounds.
Collaborative-Distilled Diffusion Models (CDDM) for Accelerated and Lightweight Trajectory Prediction
Wang, Bingzhang, Chen, Kehua, Wang, Yinhai
Abstract--Trajectory prediction is a fundamental task in Autonomous Vehicles (AVs) and Intelligent Transportation Systems (ITS), supporting efficient motion planning and real-time traffic safety management. Diffusion models have recently demonstrated strong performance in probabilistic trajectory prediction, but their large model size and slow sampling process hinder real-world deployment. This paper proposes Collaborative-Distilled Diffusion Models (CDDM), a novel method for real-time and lightweight trajectory prediction. Built upon Collaborative Progressive Distillation (CPD), CDDM progressively transfers knowledge from a high-capacity teacher diffusion model to a lightweight student model, jointly reducing both the number of sampling steps and the model size across distillation iterations. A dual-signal regularized distillation loss is further introduced to incorporate guidance from both the teacher and ground-truth data, mitigating potential overfitting and ensuring robust performance. Extensive experiments on the ETH-UCY pedestrian benchmark and the nuScenes vehicle benchmark demonstrate that CDDM achieves state-of-the-art prediction accuracy. The well-distilled CDDM retains 96.2% and 95.5% of the baseline model's ADE and FDE performance on pedestrian trajectories, while requiring only 231K parameters and 4 or 2 sampling steps, corresponding to 161× compression, 31× acceleration, and 9 ms latency. Qualitative results further show that CDDM generates diverse and accurate trajectories under dynamic agent behaviors and complex social interactions. By bridging high-performing generative models with practical deployment constraints, CDDM enables resource-efficient probabilistic prediction for AVs and ITS. With the rapid development of Autonomous Vehicles (AVs) and Intelligent Transportation Systems (ITS), research on trajectory prediction has advanced at an increasing pace.
Trajectory prediction refers to estimating the future motion or states of traffic agents (e.g., vehicles, pedestrians) in complex surrounding environments.
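The ADE and FDE figures quoted in the abstract above are displacement-error metrics. As a minimal sketch, assuming the common definitions (ADE as the mean Euclidean distance between predicted and ground-truth positions over all future timesteps, FDE as that distance at the final timestep only), they can be computed as follows. The function names and the toy trajectories are illustrative and are not taken from the paper.

```python
import math

Point = tuple[float, float]


def displacement_errors(pred: list[Point], truth: list[Point]) -> list[float]:
    """Per-timestep Euclidean distance between a predicted and a true trajectory."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    return [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]


def ade(pred: list[Point], truth: list[Point]) -> float:
    """Average Displacement Error: mean distance over all timesteps."""
    errors = displacement_errors(pred, truth)
    return sum(errors) / len(errors)


def fde(pred: list[Point], truth: list[Point]) -> float:
    """Final Displacement Error: distance at the last timestep only."""
    return displacement_errors(pred, truth)[-1]


# Toy example (made-up coordinates, in meters).
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
observed = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
toy_ade = ade(predicted, observed)
toy_fde = fde(predicted, observed)
```

Probabilistic predictors are often scored with a best-of-K variant that takes the minimum of these errors over K sampled trajectories; that variant is omitted here for brevity.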
Conflict-Based Search as a Protocol: A Multi-Agent Motion Planning Protocol for Heterogeneous Agents, Solvers, and Independent Tasks
Veerapaneni, Rishi, Tang, Alvin, He, Haodong, Zhao, Sophia, Shah, Viraj, Cen, Yidai, Ji, Ziteng, Olin, Gabriel, Arrizabalaga, Jon, Shaoul, Yorai, Li, Jiaoyang, Likhachev, Maxim
B. Algorithmically Heterogeneous MAMP Techniques Unlike algorithmically homogeneous MAMP methods, algorithmically heterogeneous MAMP methods do not require each agent to run the same solver. To our surprise, we could not find any published work that addresses this problem setting. In particular, existing MAMP methods for heterogeneous teams focus on robots with different capabilities but use algorithmically homogeneous solutions (e.g., [7], [11], [16]). On the other hand, existing multi-agent task planning/coordination methods focus on heterogeneous behaviors or task assignment and not on collision-free movement [27], [28]. Thus, part of this paper's goal is to bring attention to the Algorithmically Heterogeneous MAMP (AH-MAMP) problem setting. AH-MAMP aims to achieve collision-free motion planning for heterogeneous single-agent solvers without being able to modify the solvers. Solutions for AH-MAMP instead require designing multi-agent protocols with well-defined single-agent APIs, with the protocol/API abstraction enabling the use of heterogeneous single-agent solvers.
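The protocol/API idea described above can be illustrated with a small sketch. It is not the paper's implementation: the `SingleAgentSolver` interface, the two toy solvers, the 3x3 grid, and the `HORIZON` cap are all hypothetical. It also simplifies Conflict-Based Search in two labeled ways: it handles only vertex conflicts (no edge or swap conflicts), and it explores the constraint tree breadth-first rather than best-first by cost. The point is that the high-level loop only ever calls `plan(constraints)` and never inspects how each solver works.

```python
from __future__ import annotations

from collections import deque
from typing import Optional, Protocol

Cell = tuple[int, int]
Constraint = tuple[Cell, int]  # (cell, timestep) an agent must avoid
HORIZON = 20                   # hypothetical cap on plan length


def at(path: list[Cell], t: int) -> Cell:
    """Position at time t; an agent waits at its last cell after arriving."""
    return path[min(t, len(path) - 1)]


class SingleAgentSolver(Protocol):
    """Hypothetical single-agent API: plan a path that obeys the constraints."""

    def plan(self, constraints: frozenset[Constraint]) -> Optional[list[Cell]]: ...


class GridBFSSolver:
    """Solver 1: space-time breadth-first search on an open square grid."""

    def __init__(self, start: Cell, goal: Cell, size: int = 3):
        self.start, self.goal, self.size = start, goal, size

    def plan(self, constraints):
        if (self.start, 0) in constraints:
            return None
        queue = deque([(self.start, 0, [self.start])])
        seen = {(self.start, 0)}
        while queue:
            cell, t, path = queue.popleft()
            # Resting at the goal must not violate a later constraint on it.
            blocked_later = any(c == cell and ct > t for c, ct in constraints)
            if cell == self.goal and not blocked_later:
                return path
            if t >= HORIZON:
                continue
            x, y = cell
            for nxt in ((x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                inside = 0 <= nxt[0] < self.size and 0 <= nxt[1] < self.size
                state = (nxt, t + 1)
                if inside and state not in seen and state not in constraints:
                    seen.add(state)
                    queue.append((nxt, t + 1, path + [nxt]))
        return None


class DelayedRouteSolver:
    """Solver 2: follows a fixed route and can only wait at its start cell."""

    def __init__(self, route: list[Cell]):
        self.route = route

    def plan(self, constraints):
        for delay in range(HORIZON):
            path = [self.route[0]] * delay + self.route
            if not any(at(path, t) == c for c, t in constraints):
                return path
        return None


def first_conflict(paths: list[list[Cell]]):
    """Return (agent_i, agent_j, cell, t) for the earliest vertex conflict."""
    for t in range(max(len(p) for p in paths)):
        occupied: dict[Cell, int] = {}
        for agent, path in enumerate(paths):
            cell = at(path, t)
            if cell in occupied:
                return occupied[cell], agent, cell, t
            occupied[cell] = agent
    return None


def cbs_protocol(solvers: list[SingleAgentSolver], max_nodes: int = 1000):
    """Simplified high-level loop: resolve conflicts by constraining one agent."""
    queue = deque([[frozenset() for _ in solvers]])
    for _ in range(max_nodes):
        if not queue:
            break
        cons = queue.popleft()
        paths = [s.plan(c) for s, c in zip(solvers, cons)]
        if any(p is None for p in paths):
            continue  # this branch is infeasible for some solver
        conflict = first_conflict(paths)
        if conflict is None:
            return paths
        i, j, cell, t = conflict
        for agent in (i, j):  # branch: forbid the cell for one agent or the other
            child = list(cons)
            child[agent] = cons[agent] | {(cell, t)}
            queue.append(child)
    return None


solvers = [GridBFSSolver((0, 1), (2, 1)), DelayedRouteSolver([(1, 0), (1, 1), (1, 2)])]
paths = cbs_protocol(solvers)
```

In this toy run both agents initially want cell (1, 1) at timestep 1. The loop adds a constraint for one of them and asks that solver to replan, without needing to know whether the solver is a search, a fixed route, or anything else.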
Reasoning-Aware Prompt Orchestration: A Foundation Model for Multi-Agent Language Model Coordination
The emergence of large language models has enabled sophisticated multi-agent systems, yet coordinating their reasoning capabilities through prompt engineering remains challenging. We present a theoretically-grounded framework for dynamic prompt orchestration that enhances reasoning across multiple specialized agents. This framework addresses three core challenges: logical consistency preservation during agent transitions, reasoning-aware prompt adaptation, and scalable coordination of distributed inference. Our approach formalizes agent states using prompt templates, reasoning context vectors, and capability matrices. We prove system convergence to stable coordination patterns when step sizes satisfy $\alpha < \frac{1}{2L}$ where $L$ is the Lipschitz constant of the state transition function. We implement this through a distributed architecture that dynamically routes reasoning tasks while maintaining semantic coherence. Experimental results on 1,000 synthetic multi-agent conversations demonstrate a 42% reduction in reasoning latency, a 23% improvement in logical consistency measured by ROUGE-L score, and an 89% success rate for task completion without context loss across agent transitions. Ablation studies identify the consensus mechanism as the primary performance driver, while revealing limitations: performance degrades beyond 10 agent transitions, and the system requires 76.5GB memory for 1,000 concurrent agents. These findings establish a new paradigm for scalable reasoning in multi-agent systems, providing theoretical foundations for understanding reasoning emergence across coordinated language models.
MAVUL: Multi-Agent Vulnerability Detection via Contextual Reasoning and Interactive Refinement
Li, Youpeng, Joshi, Kartik, Wang, Xinda, Wong, Eric
Most vulnerability detection (VD) methods are limited by inadequate contextual understanding, restrictive single-round interactions, and coarse-grained evaluations, resulting in undesired model performance and biased evaluation results. To address these challenges, we propose MAVUL, a novel multi-agent VD system that integrates contextual reasoning and interactive refinement. Specifically, a vulnerability analyst agent is designed to flexibly leverage tool-using capabilities and contextual reasoning to achieve cross-procedural code understanding and effectively mine vulnerability patterns. Through iterative feedback and refined decision-making within cross-role agent interactions, the system achieves reliable reasoning and vulnerability prediction. Furthermore, MAVUL introduces multi-dimensional ground truth information for fine-grained evaluation, thereby enhancing evaluation accuracy and reliability. Extensive experiments conducted on a pairwise vulnerability dataset demonstrate MAVUL's superior performance. Our findings indicate that MAVUL significantly outperforms existing multi-agent systems with over 62% higher pairwise accuracy and single-agent systems with over 600% higher average performance. The system's effectiveness is markedly improved with increased communication rounds between the vulnerability analyst agent and the security architect agent, underscoring the importance of contextual reasoning in tracing vulnerability flows and the crucial feedback role. Additionally, the integrated evaluation agent serves as a critical, unbiased judge, ensuring a more accurate and reliable estimation of the system's real-world applicability by preventing misleading binary comparisons.
CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage
Wei, Bowen, Tay, Yuan Shen, Liu, Howard, Pan, Jinhao, Luo, Kun, Zhu, Ziwei, Jordan, Chris
Security Operations Centers (SOCs) are overwhelmed by tens of thousands of daily alerts, with only a small fraction corresponding to genuine attacks. This overload creates alert fatigue, leading to overlooked threats and analyst burnout. Classical detection pipelines are brittle and context-poor, while recent LLM-based approaches typically rely on a single model to interpret logs, retrieve context, and adjudicate alerts end-to-end -- an approach that struggles with noisy enterprise data and offers limited transparency. We propose CORTEX, a multi-agent LLM architecture for high-stakes alert triage in which specialized agents collaborate over real evidence: a behavior-analysis agent inspects activity sequences, evidence-gathering agents query external systems, and a reasoning agent synthesizes findings into an auditable decision. To support training and evaluation, we release a dataset of fine-grained SOC investigations from production environments, capturing step-by-step analyst actions and linked tool outputs. Across diverse enterprise scenarios, CORTEX substantially reduces false positives and improves investigation quality over state-of-the-art single-agent LLMs.
A Hierarchical Agentic Framework for Autonomous Drone-Based Visual Inspection
Herron, Ethan, Lee, Xian Yeow, Sin, Gregory, Diaz, Teresa Gonzalez, Farahat, Ahmed, Gupta, Chetan
Autonomous inspection systems are essential for ensuring the performance and longevity of industrial assets. Recently, agentic frameworks have demonstrated significant potential for automating inspection workflows but have been limited to digital tasks. Their application to physical assets in real-world environments, however, remains underexplored. In this work, our contributions are two-fold: first, we propose a hierarchical agentic framework for autonomous drone control, and second, we introduce a reasoning methodology for individual function executions, which we refer to as ReActEval. Our framework focuses on visual inspection tasks in indoor industrial settings, such as interpreting industrial readouts or inspecting equipment. It employs a multi-agent system comprising a head agent and multiple worker agents, each controlling a single drone. The head agent performs high-level planning and evaluates outcomes, while worker agents implement ReActEval to reason over and execute low-level actions. Operating entirely in natural language, ReActEval follows a plan-reason-act-evaluate cycle, enabling drones to handle tasks ranging from simple navigation (e.g., flying forward 10 meters and landing) to complex high-level tasks (e.g., locating and reading a pressure gauge). The evaluation phase serves as a feedback and/or replanning stage, ensuring actions align with user objectives while preventing undesirable outcomes. We evaluate the framework in a simulated environment with two worker agents, assessing performance qualitatively and quantitatively based on task completion across varying complexity levels and workflow efficiency. By leveraging natural language processing for agent communication, our approach offers a novel, flexible, and user-accessible alternative to traditional drone-based solutions, enabling autonomous problem-solving for industrial inspection without extensive user intervention.
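The plan, reason, act, evaluate cycle described above can be sketched as a simple control loop. This is only an illustration of the loop structure under strong assumptions: in the described framework each phase is carried out by language-model agents operating in natural language, whereas here every phase is a hard-coded stand-in, and the `DroneState` class, the `MAX_MOVE` limit, and all function names are hypothetical.

```python
from dataclasses import dataclass, field

MAX_MOVE = 4.0  # hypothetical limit on distance covered by one low-level action


@dataclass
class DroneState:
    x: float = 0.0          # distance flown along a single axis, in meters
    airborne: bool = True
    log: list = field(default_factory=list)


def plan(task: str) -> list[tuple[str, float]]:
    """Head-agent stand-in: a fixed plan for the one task this sketch knows."""
    if task != "fly forward 10 meters and land":
        raise ValueError("this sketch only handles one hard-coded task")
    return [("forward", 10.0), ("land", 0.0)]


def reason(goal: tuple[str, float], state: DroneState, start_x: float):
    """Worker stand-in: choose the next low-level action for the current goal."""
    kind, amount = goal
    if kind == "forward":
        remaining = amount - (state.x - start_x)
        return ("forward", min(remaining, MAX_MOVE))
    return ("land", 0.0)


def act(action: tuple[str, float], state: DroneState) -> None:
    """Execute one action against the simulated drone."""
    kind, amount = action
    if kind == "forward":
        state.x += amount
    elif kind == "land":
        state.airborne = False
    state.log.append(action)


def evaluate(goal: tuple[str, float], state: DroneState, start_x: float) -> bool:
    """Check whether the current goal has been met."""
    kind, amount = goal
    if kind == "forward":
        return state.x - start_x >= amount
    return not state.airborne


def run(task: str, max_cycles: int = 10) -> DroneState:
    state = DroneState()
    for goal in plan(task):
        start_x = state.x
        for _ in range(max_cycles):
            act(reason(goal, state, start_x), state)
            if evaluate(goal, state, start_x):
                break
        else:
            # A real system would hand control back for replanning here.
            raise RuntimeError(f"goal {goal} not completed")
    return state


final_state = run("fly forward 10 meters and land")
```

The evaluation step is what closes the loop: a goal that is not yet satisfied triggers another reason and act pass, and a goal that cannot be satisfied within the cycle budget is surfaced instead of being silently skipped.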