Agents
DiRAC - Distributed Robot Awareness and Consensus
Gopan, Uday, Kulkarni, Manjari, S, Lakshasri, Mittal, Kashish, Radhakrishna, Sriram, Naskar, Aditya, DL, Rameshwar
Abstract--DiRAC is a scalable, distributed framework designed to enable efficient task assignment and path planning in very large robotic swarms. It introduces a novel zone-partitioned architecture with dynamically elected leaders and a tick-synchronized consensus protocol that yields strong consistency and deterministic outcomes. For path planning, DiRAC uses a novel algorithm, a force-based decentralized planner for real-time collision resolution. Validated within ROS 2 middleware through preliminary simulation, DiRAC demonstrates architectural scalability and modular efficiency in simulated warehouse environments, laying the groundwork for real-world deployment in large-scale industrial and logistics domains. Index Terms--Swarm Robotics, Multi-Agent Systems, Distributed Consensus, Task Assignment, Path Planning, Distributed Algorithms, Robot Coordination, Scalable Systems, Leader Election, Fault Tolerance, Cooperative Control, Decentralized Control, ROS 2 Middleware.
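The abstract does not specify how DiRAC's force-based planner works, so the following is only a generic illustration of the idea: a textbook attractive/repulsive force step in Python. The function name `force_step` and the parameters `k_att`, `k_rep`, `radius`, and `dt` are assumptions made for this sketch, not names or values from the paper.

```python
import math

def force_step(pos, goal, neighbors, k_att=1.0, k_rep=0.5, radius=1.0, dt=0.1):
    """One update of a generic force-based planner (illustrative, not DiRAC's algorithm)."""
    # Attractive force pulls the robot straight toward its goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for nx, ny in neighbors:
        dx, dy = pos[0] - nx, pos[1] - ny
        dist = math.hypot(dx, dy)
        # Repulsion acts only inside the influence radius and grows as robots get closer.
        if 0.0 < dist < radius:
            push = k_rep * (1.0 / dist - 1.0 / radius) / dist**2
            fx += push * dx
            fy += push * dy
    # Move a small step along the net force.
    return (pos[0] + dt * fx, pos[1] + dt * fy)
```

Each robot can run such a step using only its own position, its goal, and the positions of nearby robots, which is what makes this family of planners decentralized.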
FinSight: Towards Real-World Financial Deep Research
Jin, Jiajie, Zhang, Yuyao, Xu, Yimeng, Qian, Hongjin, Zhu, Yutao, Dou, Zhicheng
Generating professional financial reports is a labor-intensive and intellectually demanding process that current AI systems struggle to fully automate. To address this challenge, we introduce FinSight (Financial InSight), a novel multi-agent framework for producing high-quality, multimodal financial reports. The foundation of FinSight is the Code Agent with Variable Memory (CAVM) architecture, which unifies external data, designed tools, and agents into a programmable variable space, enabling flexible data collection, analysis, and report generation through executable code. To ensure professional-grade visualization, we propose an Iterative Vision-Enhanced Mechanism that progressively refines raw visual outputs into polished financial charts. Furthermore, a two-stage Writing Framework expands concise Chain-of-Analysis segments into coherent, citation-aware, and multimodal reports, ensuring both analytical depth and structural consistency. Experiments on various company- and industry-level tasks demonstrate that FinSight significantly outperforms all baselines, including leading deep research systems, in terms of factual accuracy, analytical depth, and presentation quality, demonstrating a clear path toward generating reports that approach human-expert quality.
Enhancing Language Agent Strategic Reasoning through Self-Play in Adversarial Games
Zhang, Yikai, Rong, Ye, Yuan, Siyu, Chen, Jiangjie, Xie, Jian, Xiao, Yanghua
Existing language agents often encounter difficulties in dynamic adversarial games due to poor strategic reasoning. To mitigate this limitation, a promising approach is to allow agents to learn from game interactions automatically, without relying on costly expert-labeled data. Unlike static environments where agents receive fixed feedback or rewards, selecting appropriate opponents in dynamic adversarial games can significantly impact learning performance. However, the discussion of opponents in adversarial environments remains an area under exploration. In this paper, we propose a Step-level poliCy Optimization method through Play-And-Learn, SCO-PAL. Leveraging SCO-PAL, we conduct a detailed analysis of opponent selection by setting opponents at different levels and find that self-play is the most effective way to improve strategic reasoning in such adversarial environments. Utilizing SCO-PAL with self-play, we increase the average win rate against four opponents by approximately 30% compared to baselines and achieve a 54.76% win rate against GPT-4 in six adversarial games.
Prompt Optimization via Retrieved Reasoning Assets and Multi-Agent Analysis
Seo, Wonduk, Lee, Juhyeon, Koh, Junseo, An, Hyunjin, Park, Jian, Lee, Seunghyun, Chen, Haihua, Bu, Yi
Prompt optimization has emerged as an effective alternative to retraining for improving the performance of Large Language Models (LLMs). However, most existing approaches treat evaluation as a black box, relying solely on numerical scores while offering limited insight into why a prompt succeeds or fails. They also depend heavily on trial-and-error refinements, which are difficult to interpret and control. In this paper, we introduce MA-SAPO, a Multi-Agent framework for Score-Aware Prompt Optimization. Compared to prior methods, MA-SAPO explicitly couples evaluation outcomes with structured reasoning to guide systematic edits. The framework specifically consists of two stages: during the Reasoning Phase, agents collaboratively explain metric scores, diagnose weaknesses, and synthesize targeted refinements that are stored as reusable reasoning assets; during the Test Phase, agents retrieve these assets to analyze optimized prompts and apply only evidence-grounded edits. By turning evaluation signals into interpretable reasoning chains, MA-SAPO produces prompt refinements that are more transparent, auditable, and controllable. Experiments on the HelpSteer1/2 benchmarks demonstrate consistent improvements over single-pass prompting, retrieval-augmented baselines, and prior multi-agent strategies, validating the effectiveness of our approach.
Ripple Effect Protocol: Coordinating Agent Populations
Chopra, Ayush, Sharma, Aman, Ahmad, Feroz, Muscariello, Luca, Pandey, Vijoy, Raskar, Ramesh
Modern AI agents can exchange messages using protocols such as A2A and ACP, yet these mechanisms emphasize communication over coordination. As agent populations grow, this limitation produces brittle collective behavior, where individually smart agents converge on poor group outcomes. We introduce the Ripple Effect Protocol (REP), a coordination protocol in which agents share not only their decisions but also lightweight sensitivities: signals expressing how their choices would change if key environmental variables shifted. These sensitivities ripple through local networks, enabling groups to align faster and more stably than with agent-centric communication alone. We formalize REP's protocol specification, separating required message schemas from optional aggregation rules, and evaluate it across scenarios with varying incentives and network topologies. Benchmarks across three domains, (i) supply chain cascades (Beer Game), (ii) preference aggregation in sparse networks (Movie Scheduling), and (iii) sustainable resource allocation (Fishbanks), show that REP improves coordination accuracy and efficiency over A2A by 41 to 100%, while flexibly handling multimodal sensitivity signals from LLMs. By making coordination a protocol-level capability, REP provides scalable infrastructure for the emerging Internet of Agents.
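To make the idea of sharing decisions together with sensitivities concrete, here is a minimal Python sketch. The abstract does not give REP's actual message schema or aggregation rules, so the `RippleMessage` fields and the averaging rule in `aggregate_sensitivities` are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from statistics import fmean

@dataclass
class RippleMessage:
    """Hypothetical message: a decision plus per-variable sensitivities."""
    agent_id: str
    decision: float  # e.g. an order quantity in a supply chain game
    # Maps a variable name to how much the decision would change per unit shift in it.
    sensitivities: dict[str, float] = field(default_factory=dict)

def aggregate_sensitivities(messages):
    """Average each variable's sensitivity over the neighbors that reported it.

    Plain averaging is an assumed aggregation rule, not one taken from the paper.
    """
    collected = {}
    for msg in messages:
        for var, value in msg.sensitivities.items():
            collected.setdefault(var, []).append(value)
    return {var: fmean(values) for var, values in collected.items()}
```

An agent receiving messages from its neighbors could combine the aggregated sensitivities with its own forecast of the environment before committing to a decision; how that is done is left open here.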
What Questions Should Robots Be Able to Answer? A Dataset of User Questions for Explainable Robotics
Wachowiak, Lennart, Coles, Andrew, Canal, Gerard, Celiktutan, Oya
With the growing use of large language models and conversational interfaces in human-robot interaction, robots' ability to answer user questions is more important than ever. We therefore introduce a dataset of 1,893 user questions for household robots, collected from 100 participants and organized into 12 categories and 70 subcategories. Most work in explainable robotics focuses on why-questions. In contrast, our dataset provides a wide variety of questions, from questions about simple execution details to questions about how the robot would act in hypothetical scenarios -- thus giving roboticists valuable insights into what questions their robot needs to be able to answer. To collect the dataset, we created 15 video stimuli and 7 text stimuli, depicting robots performing varied household tasks. We then asked participants on Prolific what questions they would want to ask the robot in each portrayed situation. In the final dataset, the most frequent categories are questions about task execution details (22.5%), the robot's capabilities (12.7%), and performance assessments (11.3%). Although questions about how robots would handle potentially difficult scenarios and ensure correct behavior are less frequent, users rank them as the most important for robots to be able to answer. Moreover, we find that users who identify as novices in robotics ask different questions than more experienced users. Novices are more likely to inquire about simple facts, such as what the robot did or the current state of the environment. As robots enter environments shared with humans and language becomes central to giving instructions and interaction, this dataset provides a valuable foundation for (i) identifying the information robots need to log and expose to conversational interfaces, (ii) benchmarking question-answering modules, and (iii) designing explanation strategies that align with user expectations.
WEBSERV: A Browser-Server Environment for Efficient Training of Reinforcement Learning-based Web Agents at Scale
Lu, Yuxuan, Huang, Jing, Liu, Hui, Gesi, Jiri, Han, Yan, Fu, Shihan, Zheng, Tianqi, Wang, Dakuo
Training and evaluation of Reinforcement Learning (RL) web agents have gained increasing attention, yet a scalable and efficient environment that couples realistic and robust browser-side interaction with controllable server-side state at scale is still missing. Existing environments tend to have one or more of the following issues: they overwhelm policy models with excessive and noisy context; they perform actions non-deterministically without waiting for the UI or network to stabilize; or they cannot scale isolated client-server containers effectively for parallel RL rollouts. We propose WEBSERV, an environment that includes 1) a compact, site-agnostic browser environment that balances context and action complexity, and 2) a scalable RL environment via efficient launching and resetting web-servers to enable scalable RL training and evaluation. We evaluate WEBSERV on the shopping CMS and Gitlab tasks in WebArena, achieving state-of-the-art single-prompt success rates while cutting launch latency by ~5x and storage need by ~240x, with a comparable memory footprint, enabling 200+ concurrent containers on a single host.
Heterogeneous Multi-Agent Task-Assignment with Uncertain Execution Times and Preferences
Wei, Qinshuang, Srivastava, Vaibhav, Gupta, Vijay
While sequential task assignment for a single agent has been widely studied, such problems in a multi-agent setting, where the agents have heterogeneous task preferences or capabilities, remain less well-characterized. We study a multi-agent task assignment problem where a central planner assigns recurring tasks to multiple members of a team over a finite time horizon. For any given task, the members have heterogeneous capabilities in terms of task completion times, task resource consumption (which can model variables such as energy or attention), and preferences in terms of the rewards they collect upon task completion. We assume that the reward, execution time, and resource consumption for each member to complete any task are stochastic with unknown distributions. The goal of the planner is to maximize the total expected reward that the team receives over the problem horizon while ensuring that the resource consumption required for any assigned task is within the capability of the agent. We propose and analyze a bandit algorithm for this problem. Since the bandit algorithm relies on solving an optimal task assignment problem repeatedly, we analyze the achievable regret in two cases: when we can solve the optimal task assignment exactly and when we can solve it only approximately.
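The abstract does not describe the proposed bandit algorithm in detail. As a rough illustration of the general approach, the sketch below applies the standard UCB1 index to (agent, task) pairs in Python. It ignores execution times and resource constraints, and the class `UCBAssigner` and its methods are names assumed for this example, not the authors' method.

```python
import math

class UCBAssigner:
    """Standard UCB1 over (agent, task) pairs; a simplified stand-in, not the paper's algorithm."""

    def __init__(self, n_agents, n_tasks):
        self.counts = [[0] * n_tasks for _ in range(n_agents)]
        self.means = [[0.0] * n_tasks for _ in range(n_agents)]
        self.t = 0  # number of assignments made so far

    def index(self, agent, task):
        n = self.counts[agent][task]
        if n == 0:
            return float("inf")  # try every pair at least once
        # Empirical mean reward plus an exploration bonus that shrinks with more samples.
        return self.means[agent][task] + math.sqrt(2.0 * math.log(self.t) / n)

    def assign(self, task):
        """Pick the agent with the highest UCB index for this task."""
        self.t += 1
        return max(range(len(self.counts)), key=lambda a: self.index(a, task))

    def update(self, agent, task, reward):
        """Fold an observed reward into the running mean for this pair."""
        self.counts[agent][task] += 1
        n = self.counts[agent][task]
        self.means[agent][task] += (reward - self.means[agent][task]) / n
```

A planner would call `assign` when a task arrives, observe the reward after the chosen agent finishes, and pass it to `update`. Over time, assignments concentrate on the agent with the highest estimated reward for each task.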
Agentic AI for Ultra-Modern Networks: Multi-Agent Framework for RAN Autonomy and Assurance
Singh, Sukhdeep, Bhat, Avinash, M, Shweta, Singh, Subhash K, Hong, Moonki, K, Madhan Raj, Sithamparanathan, Kandeepan, Khowaja, Sunder A., Dev, Kapal
Traditional O-RAN control loops rely heavily on RIC-based orchestration, which centralizes intelligence and exposes the system to risks such as policy conflicts, data drift, and unsafe actions under unforeseen conditions. In this work, we argue that the future of autonomous networks lies in a multi-agentic architecture, where specialized agents collaborate to perform data collection, model training, prediction, policy generation, verification, deployment, and assurance. By replacing tightly-coupled centralized RIC-based workflows with distributed agents, the framework achieves autonomy, resilience, explainability, and system-wide safety. To substantiate this vision, we design and evaluate a traffic steering use case under surge and drift conditions. Results across four KPIs: RRC connected users, IP throughput, PRB utilization, and SINR, demonstrate that a naive predictor-driven deployment improves local KPIs but destabilizes neighbors, whereas the agentic system blocks unsafe policies, preserving global network health. This study highlights multi-agent architectures as a credible foundation for trustworthy AI-driven autonomy in next-generation RANs.
Interpretable RNA-Seq Clustering with an LLM-Based Agentic Evidence-Grounded Framework
Hossain, Elias, Shoeibi, Mehrdad, Garibay, Ivan, Yousefi, Niloofar
While clustering methods such as spectral clustering and K-means effectively group genes by expression similarity, downstream interpretation is typically performed using enrichment-based statistics. These approaches provide high-level functional summaries but often fail to yield cluster-specific mechanistic insight or explicit links to supporting literature. As a result, biological interpretation frequently relies on manual curation, limiting reproducibility and scalability. Large language models (LLMs) have recently emerged as powerful tools for biomedical text mining and knowledge synthesis. Although LLMs can generate fluent biological narratives, they are optimized for linguistic coherence rather than evidential accountability. When applied directly to transcriptomic interpretation, they may produce plausible but unverifiable statements, omit explicit citations, or hallucinate unsupported claims. While retrieval-augmented and agentic systems partially address this issue, systematic verification and critic-based validation remain underexplored. This limitation is particularly consequential for antimicrobial resistance research in Salmonella enterica, a major foodborne pathogen responsible for substantial global morbidity.