Agents
Toward Real-World Cooperative and Competitive Soccer with Quadrupedal Robot Teams
Su, Zhi, Gao, Yuman, Lukas, Emily, Li, Yunfei, Cai, Jiaze, Tulbah, Faris, Gao, Fei, Yu, Chao, Li, Zhongyu, Wu, Yi, Sreenath, Koushil
Achieving coordinated teamwork among legged robots requires both fine-grained locomotion control and long-horizon strategic decision-making. Robot soccer offers a compelling testbed for this challenge, combining dynamic, competitive, and multi-agent interactions. In this work, we present a hierarchical multi-agent reinforcement learning (MARL) framework that enables fully autonomous and decentralized quadruped robot soccer. First, a set of highly dynamic low-level skills is trained for legged locomotion and ball manipulation, such as walking, dribbling, and kicking. On top of these, a high-level strategic planning policy is trained with Multi-Agent Proximal Policy Optimization (MAPPO) via Fictitious Self-Play (FSP). This learning framework allows agents to adapt to diverse opponent strategies and gives rise to sophisticated team behaviors, including coordinated passing, interception, and dynamic role allocation. With an extensive ablation study, the proposed learning method shows significant advantages in the cooperative and competitive multi-agent soccer game. We deploy the learned policies to real quadruped robots relying solely on onboard proprioception and decentralized localization, with the resulting system supporting autonomous robot-robot and robot-human soccer matches on indoor and outdoor soccer courts.
Morphologically Symmetric Reinforcement Learning for Ambidextrous Bimanual Manipulation
Li, Zechu, Jin, Yufeng, Apraez, Daniel Ordonez, Semini, Claudio, Liu, Puze, Chalvatzaki, Georgia
Humans naturally exhibit bilateral symmetry in their gross manipulation skills, effortlessly mirroring simple actions between left and right hands. Bimanual robots-which also feature bilateral symmetry-should similarly exploit this property to perform tasks with either hand. Unlike humans, who often favor a dominant hand for fine dexterous skills, robots should ideally execute ambidextrous manipulation with equal proficiency. To this end, we introduce SYMDEX (SYMmetric DEXterity), a reinforcement learning framework for ambidextrous bi-manipulation that leverages the robot's inherent bilateral symmetry as an inductive bias. SYMDEX decomposes complex bimanual manipulation tasks into per-hand subtasks and trains dedicated policies for each. By exploiting bilateral symmetry via equivariant neural networks, experience from one arm is inherently leveraged by the opposite arm. We then distill the subtask policies into a global ambidextrous policy that is independent of the hand-task assignment. We evaluate SYMDEX on six challenging simulated manipulation tasks and demonstrate successful real-world deployment on two of them. Our approach strongly outperforms baselines on complex task in which the left and right hands perform different roles. We further demonstrate SYMDEX's scalability by extending it to a four-arm manipulation setup, where our symmetry-aware policies enable effective multi-arm collaboration and coordination. Our results highlight how structural symmetry as inductive bias in policy learning enhances sample efficiency, robustness, and generalization across diverse dexterous manipulation tasks.
On Word-of-Mouth and Private-Prior Sequential Social Learning
Da Col, Andrea, Rojas, Cristian R., Krishnamurthy, Vikram
-- Social learning constitutes a fundamental framework for studying interactions among rational agents who observe each other's actions but lack direct access to individual beliefs. This paper investigates a specific social learning paradigm known as Word-of-Mouth (WoM), where a series of agents seeks to estimate the state of a dynamical system. The first agent receives noisy measurements of the state, while each subsequent agent relies solely on a degraded version of her predecessor's estimate. A defining feature of WoM is that the final agent's belief is publicly broadcast and subsequently adopted by all agents, in place of their own. We analyze this setting theoretically and through numerical simulations, noting that some agents benefit from using the belief of the last agent, while others experience performance deterioration.
Contemporary Agent Technology: LLM-Driven Advancements vs Classic Multi-Agent Systems
Bฤdicฤ, Costin, Bฤdicฤ, Amelia, Ganzha, Maria, Ivanoviฤ, Mirjana, Paprzycki, Marcin, Seliลteanu, Dan, Wrona, Zofia
This contribution provides our comprehensive reflection on the contemporary agent technology, with a particular focus on the advancements driven by Large Language Models (LLM) vs classic Multi - Agent Systems (MAS). It delves into the models, approaches, and characteristics that define these new systems. The paper emphasizes the critical analysis of how the recent developments relate to the foundational MAS, as articulated in the core academic literature. Finally, it identifies key challenges and promising future directions in this rapidly evolving domain.
GridMind: LLMs-Powered Agents for Power System Analysis and Operations
Jin, Hongwei, Kim, Kibaek, Kwon, Jonghwan
The complexity of traditional power system analysis workflows presents significant barriers to efficient decision-making in modern electric grids. This paper presents GridMind, a multi-agent AI system that integrates Large Language Models (LLMs) with deterministic engineering solvers to enable conversational scientific computing for power system analysis. The system employs specialized agents coordinating AC Optimal Power Flow and N-1 contingency analysis through natural language interfaces while maintaining numerical precision via function calls. GridMind addresses workflow integration, knowledge accessibility, context preservation, and expert decision-support augmentation. Experimental evaluation on IEEE test cases demonstrates that the proposed agentic framework consistently delivers correct solutions across all tested language models, with smaller LLMs achieving comparable analytical accuracy with reduced computational latency. This work establishes agentic AI as a viable paradigm for scientific computing, demonstrating how conversational interfaces can enhance accessibility while preserving numerical rigor essential for critical engineering applications.
Coral: A Unifying Abstraction Layer for Composable Robotics Software
Swanbeck, Steven, Pryor, Mitch
Despite the multitude of excellent software components and tools available in the robotics and broader software engineering communities, successful integration of software for robotic systems remains a time-consuming and challenging task for users of all knowledge and skill levels. And with robotics software often being built into tightly coupled, monolithic systems, even minor alterations to improve performance, adjust to changing task requirements, or deploy to new hardware can require significant engineering investment. To help solve this problem, this paper presents Coral, an abstraction layer for building, deploying, and coordinating independent software components that maximizes composability to allow for rapid system integration without modifying low-level code. Rather than replacing existing tools, Coral complements them by introducing a higher-level abstraction that constrains the integration process to semantically meaningful choices, reducing the configuration burden without limiting adaptability to diverse domains, systems, and tasks. We describe Coral in detail and demonstrate its utility in integrating software for scenarios of increasing complexity, including LiDAR-based SLAM and multi-robot corrosion mitigation tasks. By enabling practical composability in robotics software, Coral offers a scalable solution to a broad range of robotics system integration challenges, improving component reusability, system reconfigurability, and accessibility to both expert and non-expert users. We release Coral open source.
Hybrid Autonomy Framework for a Future Mars Science Helicopter
Di Pierno, Luca, Hewitt, Robert, Weiss, Stephan, Brockers, Roland
-- Autonomous aerial vehicles, such as NASA's Ingenuity, enable rapid planetary surface exploration beyond the reach of ground-based robots. Thus, NASA is studying a Mars Science Helicopter (MSH), an advanced concept capable of performing long-range science missions and autonomously navigating challenging Martian terrain. Given significant Earth-Mars communication delays and mission complexity, an advanced autonomy framework is required to ensure safe and efficient operation by continuously adapting behavior based on mission objectives and real-time conditions, without human intervention. This study presents a deterministic high-level control framework for aerial exploration, integrating a Finite State Machine (FSM) with Behavior Trees (BTs) to achieve a scalable, robust, and computationally efficient autonomy solution for critical scenarios like deep space exploration. In this paper we outline key capabilities of a possible MSH and detail the FSM-BT hybrid autonomy framework which orchestrates them to achieve the desired objectives. These inputs trigger state transitions or dynamically adjust behavior execution, enabling reactive and context-aware responses. The framework is middleware-agnostic, supporting integration with systems like F-Prime and extending beyond aerial robotics. Aerial vehicles have revolutionized planetary exploration by enabling access to scientifically valuable but hazardous terrain beyond the reach of ground-based robots. NASA's Ingenuity Mars helicopter demonstrated the feasibility of controlled flight on Mars, completing 72 successful flights despite the planet's thin atmosphere [1], [2]. However, Ingenuity was a technology demonstrator, designed primarily to validate powered flight rather than autonomously execute complex scientific missions.
Physics Supernova: AI Agent Matches Elite Gold Medalists at IPhO 2025
Qiu, Jiahao, Shi, Jingzhe, Juan, Xinzhe, Zhao, Zelin, Geng, Jiayi, Liu, Shilong, Wang, Hongru, Wu, Sanfeng, Wang, Mengdi
Physics provides fundamental laws that describe and predict the natural world. AI systems aspiring toward more general, real-world intelligence must therefore demonstrate strong physics problem-solving abilities: to formulate and apply physical laws for explaining and predicting physical processes. The International Physics Olympiad (IPhO)--the world's most prestigious physics competition--offers a rigorous benchmark for this purpose. We introduce Physics Supernova, an AI agent system with superior physics problem-solving abilities that match elite IPhO gold medalists. In IPhO 2025 theory problems, Physics Supernova attains 23.5/30 points, ranking 14th of 406 contestants and surpassing the median performance of human gold medalists. We extensively analyzed Physics Supernova's capabilities and flexibility across diverse physics tasks. These results show that principled tool integration within agent systems can deliver competitive improvements in solving challenging science problems. The codes are available at https://github.com/CharlesQ9/Physics-Supernova.
Throttling Web Agents Using Reasoning Gates
Kumar, Abhinav, Roh, Jaechul, Naseh, Ali, Houmansadr, Amir, Bagdasarian, Eugene
AI web agents use Internet resources at far greater speed, scale, and complexity -- changing how users and services interact. Deployed maliciously or erroneously, these agents could overload content providers. At the same time, web agents can bypass CAPTCHAs and other defenses by mimicking user behavior or flood authentication systems with fake accounts. Yet providers must protect their services and content from denial-of-service attacks and scraping by web agents. In this paper, we design a framework that imposes tunable costs on agents before providing access to resources; we call this Web Agent Throttling. We start by formalizing Throttling Gates as challenges issued to an agent that are asymmetric, scalable, robust, and compatible with any agent. Focusing on a common component -- the language model -- we require the agent to solve reasoning puzzles, thereby incurring excessive token-generation costs. However, we find that using existing puzzles, e.g., coding or math, as throttling gates fails to satisfy our properties. To address this, we introduce rebus-based Reasoning Gates, synthetic text puzzles that require multi-hop reasoning over world knowledge (thereby throttling an agent's model). We design a scalable generation and verification protocol for such reasoning gates. Our framework achieves computational asymmetry, i.e., the response-generation cost is 9.2x higher than the generation cost for SOTA models. We further deploy reasoning gates on a custom website and Model Context Protocol (MCP) servers and evaluate with real-world web agents. Finally, we discuss the limitations and environmental impact of real-world deployment of our framework.
Quantum game models for interaction-aware decision-making in automated driving
Essalmi, Karim, Garrido, Fernando, Nashashibi, Fawzi
Decision-making in automated driving must consider interactions with surrounding agents to be effective. However, traditional methods often neglect or oversimplify these interactions because they are difficult to model and solve, which can lead to overly conservative behavior of the ego vehicle. To address this gap, we propose two quantum game models, QG-U1 (Quantum Game - Unitary 1) and QG-G4 (Quantum Game - Gates 4), for interaction-aware decision-making. These models extend classical game theory by incorporating principles of quantum mechanics, such as superposition, interference, and entanglement. Specifically, QG-U1 and QG-G4 are designed for two-player games with two strategies per player and can be executed in real time on a standard computer without requiring quantum hardware. We evaluate both models in merging and roundabout scenarios and compare them with classical game-theoretic methods and baseline approaches (IDM, MOBIL, and a utility-based technique). Results show that QG-G4 achieves lower collision rates and higher success rates compared to baseline methods, while both quantum models yield higher expected payoffs than classical game approaches under certain parameter settings.