Agents
Near Optimal Best Arm Identification for Clustered Bandits
Yash, Karamchandani, Nikhil, Ghosh, Avishek
This work investigates the problem of best arm identification for multi-agent multi-armed bandits. We consider $N$ agents grouped into $M$ clusters, where each cluster solves a stochastic bandit problem. The mapping between agents and bandits is a priori unknown. Each bandit is associated with $K$ arms, and the goal is to identify the best arm for each agent under a $δ$-probably correct ($δ$-PC) framework, while minimizing sample complexity and communication overhead. We propose two novel algorithms: Clustering then Best Arm Identification (Cl-BAI) and Best Arm Identification then Clustering (BAI-Cl). Cl-BAI uses a two-phase approach that first clusters agents based on the bandit problems they are learning, followed by identifying the best arm for each cluster. BAI-Cl reverses the sequence by identifying the best arms first and then clustering agents accordingly. Both algorithms leverage the successive elimination framework to ensure computational efficiency and high accuracy. We establish $δ$-PC guarantees for both methods, derive bounds on their sample complexity, and provide a lower bound for this problem class. Moreover, when $M$ is small (a constant), we show that the sample complexity of a variant of BAI-Cl is minimax optimal in an order-wise sense. Experiments on synthetic and real-world datasets (MovieLens, Yelp) demonstrate the superior performance of the proposed algorithms in terms of sample and communication efficiency, particularly in settings where $M \ll N$.
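As a rough illustration of the successive elimination framework mentioned in the abstract above, the following is a minimal single-agent sketch, not the paper's Cl-BAI or BAI-Cl algorithms. It assumes rewards in [0, 1] and a Hoeffding-style confidence radius with a union bound over arms and rounds; the names `successive_elimination`, `pull`, and `max_rounds` are hypothetical and chosen only for this example.

```python
import math
import random


def successive_elimination(pull, num_arms, delta, max_rounds=100_000):
    """Return the arm believed to be best with probability at least 1 - delta.

    `pull(arm)` is an assumed callback returning a reward in [0, 1].
    """
    active = list(range(num_arms))
    sums = [0.0] * num_arms
    for t in range(1, max_rounds + 1):
        if len(active) == 1:
            break
        # Every surviving arm is sampled once per round, so each active
        # arm has exactly t samples after this loop.
        for arm in active:
            sums[arm] += pull(arm)
        # Hoeffding radius; the t*t term pays for the union bound over rounds.
        radius = math.sqrt(math.log(4.0 * num_arms * t * t / delta) / (2.0 * t))
        means = {arm: sums[arm] / t for arm in active}
        best = max(means.values())
        # Drop arms whose upper bound falls below the leader's lower bound.
        active = [arm for arm in active if best - means[arm] < 2.0 * radius]
    # If the round budget runs out, fall back to the empirical leader.
    return max(active, key=lambda arm: sums[arm])


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.2, 0.5, 0.9]  # illustrative Bernoulli arms
    winner = successive_elimination(
        lambda arm: 1.0 if rng.random() < true_means[arm] else 0.0,
        num_arms=len(true_means),
        delta=0.05,
    )
    print("identified arm:", winner)
```

In a clustered setting, a routine like this would be one building block: the abstract describes running it either after agents are grouped (Cl-BAI) or before grouping them by their identified arms (BAI-Cl).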
CartoAgent: a multimodal large language model-powered multi-agent cartographic framework for map style transfer and evaluation
Wang, Chenglong, Kang, Yuhao, Gong, Zhaoya, Zhao, Pengjun, Feng, Yu, Zhang, Wenjia, Li, Ge
The rapid development of generative artificial intelligence (GenAI) presents new opportunities to advance the cartographic process. Previous studies have either overlooked the artistic aspects of maps or faced challenges in creating both accurate and informative maps. In this study, we propose CartoAgent, a novel multi-agent cartographic framework powered by multimodal large language models (MLLMs). This framework simulates three key stages in cartographic practice: preparation, map design, and evaluation. At each stage, different MLLMs act as agents with distinct roles to collaborate, discuss, and utilize tools for specific purposes. In particular, CartoAgent leverages MLLMs' visual aesthetic capability and world knowledge to generate maps that are both visually appealing and informative. By separating style from geographic data, it can focus on designing stylesheets without modifying the vector-based data, thereby ensuring geographic accuracy. We applied CartoAgent to a specific task centered on map restyling, namely map style transfer and evaluation. The effectiveness of this framework was validated through extensive experiments and a human evaluation study. CartoAgent can be extended to support a variety of cartographic design decisions and inform future integrations of GenAI in cartography.
Security of Internet of Agents: Attacks and Countermeasures
Wang, Yuntao, Pan, Yanghe, Guo, Shaolong, Su, Zhou
With the rise of large language and vision-language models, AI agents have evolved into autonomous, interactive systems capable of perception, reasoning, and decision-making. As they proliferate across virtual and physical domains, the Internet of Agents (IoA) has emerged as a key infrastructure for enabling scalable and secure coordination among heterogeneous agents. This survey offers a comprehensive examination of the security and privacy landscape in IoA systems. We begin by outlining the IoA architecture and its distinct vulnerabilities compared to traditional networks, focusing on four critical aspects: identity authentication threats, cross-agent trust issues, embodied security, and privacy risks. We then review existing and emerging defense mechanisms and highlight persistent challenges. Finally, we identify open research directions to advance the development of resilient and privacy-preserving IoA ecosystems.
Intelligent Product 3.0: Decentralised AI Agents and Web3 Intelligence Standards
Wong, Alex C. Y., McFarlane, Duncan, Ellarby, C., Lee, M., Kuok, M.
The "Intelligent Product" was first introduced as a way to embed intelligence within everyday objects, enabling them to assess and influence their own destiny (Wong et al., 2002). The concept built on the technologies and infrastructure being developed at the Auto-ID Center (Sarma et al., 2000), notably the Electronic Product Code (EPC) for Radio Frequency Identification (RFID), along with related standards for storing and communicating product data. However, this predated blockchain, while the Internet of Things (IoT), a term also coined at the Auto-ID Center by Kevin Ashton (Ashton, 2009), and the Internet itself were still in their infancy as communication platforms. Embedded AI, primarily implemented through software agents, remained largely a research tool at the time. As a result, truly autonomous and fully intelligent products were not attainable until recent innovations in blockchain, Web3, and artificial intelligence. This paper revisits the original vision and specification of the Intelligent Product, charts its refinement over the years, and demonstrates how these emerging capabilities have paved the way for Intelligent Product 3.0.
Air-Ground Collaboration for Language-Specified Missions in Unknown Environments
Cladera, Fernando, Ravichandran, Zachary, Hughes, Jason, Murali, Varun, Nieto-Granda, Carlos, Hsieh, M. Ani, Pappas, George J., Taylor, Camillo J., Kumar, Vijay
As autonomous robotic systems become increasingly mature, users will want to specify missions at the level of intent rather than in low-level detail. Language is an expressive and intuitive medium for such mission specification. However, realizing language-guided robotic teams requires overcoming significant technical hurdles. Interpreting and realizing language-specified missions requires advanced semantic reasoning. Successful heterogeneous robots must effectively coordinate actions and share information across varying viewpoints. Additionally, communication between robots is typically intermittent, necessitating robust strategies that leverage communication opportunities to maintain coordination and achieve mission objectives. In this work, we present a first-of-its-kind system where an unmanned aerial vehicle (UAV) and an unmanned ground vehicle (UGV) are able to collaboratively accomplish missions specified in natural language while reacting to changes in specification on the fly. We leverage a Large Language Model (LLM)-enabled planner to reason over semantic-metric maps that are built online and opportunistically shared between an aerial and a ground robot. We consider task-driven navigation in urban and rural areas. Our system must infer mission-relevant semantics and actively acquire information via semantic mapping. In both ground and air-ground teaming experiments, we demonstrate our system on seven different natural-language specifications at up to kilometer-scale navigation.
Multi-Agent Reinforcement Learning Simulation for Environmental Policy Synthesis
Rudd-Jones, James, Musolesi, Mirco, Pérez-Ortiz, María
Climate policy development faces significant challenges due to deep uncertainty, complex system dynamics, and competing stakeholder interests. Climate simulation methods, such as Earth System Models, have become valuable tools for policy exploration. However, they are typically used to evaluate potential policies rather than to synthesize them directly. The problem can be inverted to optimize for policy pathways, but traditional optimization approaches often struggle with non-linear dynamics, heterogeneous agents, and comprehensive uncertainty quantification. We propose a framework for augmenting climate simulations with Multi-Agent Reinforcement Learning (MARL) to address these limitations. We identify key challenges at the interface between climate simulations and the application of MARL in the context of policy synthesis, including reward definition, scalability with increasing agents and state spaces, uncertainty propagation across linked systems, and solution validation. Additionally, we discuss challenges in making MARL-derived solutions interpretable and useful for policy-makers. Our framework provides a foundation for more sophisticated climate policy exploration while acknowledging important limitations and areas for future research.
Multi-modal Synthetic Data Training and Model Collapse: Insights from VLMs and Diffusion Models
Hu, Zizhao, Rostami, Mohammad, Thomason, Jesse
Recent research has highlighted the risk of generative model collapse, where performance progressively degrades when a model is continually trained on self-generated data. However, existing work on model collapse is limited to single, unimodal models, which restricts our understanding of more realistic scenarios, such as diverse multi-modal AI agents interacting autonomously through synthetic data and continually evolving. We expand the synthetic data training and model collapse study to multi-modal vision-language generative systems, such as vision-language models (VLMs) and text-to-image diffusion models, as well as recursive generate-train loops with multiple models. We find that model collapse, previously observed in single-modality generative models, exhibits distinct characteristics in the multi-modal context, such as improved vision-language alignment and increased variance in the VLM image-captioning task. Additionally, we find that general approaches such as increased decoding budgets, greater model diversity, and relabeling with frozen models can effectively mitigate model collapse. Our findings provide initial insights and practical guidelines for reducing the risk of model collapse in self-improving multi-agent AI systems and curating robust multi-modal synthetic datasets.
Enhancing Aerial Combat Tactics through Hierarchical Multi-Agent Reinforcement Learning
Selmonaj, Ardian, Szehr, Oleg, Del Rio, Giacomo, Antonucci, Alessandro, Schneider, Adrian, Rüegsegger, Michael
This is motivated by the strong performance of RL agents in finding effective Courses of Action (CoA) across a wide range of environments, including combinatorial settings such as Chess or Go [1], real-time continuous control tasks found in arcade video games [2], and scenarios that combine control with strategic decision-making, as seen in modern wargames [3]. The application of RL in the context of air combat comes with a number of specific challenges. These include structural properties of the simulation scenario, such as the complexity of the individual units and their flight dynamics, the exponential size of the combined state and action spaces, the depth of the planning horizon, and the presence of stochasticity and imperfect information. Overall, the size of the game tree (i.e., the set of possible CoAs) in strategic games and defense scenarios appears vast and beyond the reach of straightforward search. Furthermore, real-world operations involve not only the simultaneous maneuvering of individual units but also attention to strategic positions and global mission planning. Training policies that integrate real-time control at the troop level with high-level mission planning at the commander level is challenging, as these tasks inherently demand distinct system requirements, algorithmic approaches, and training configurations.
Design of a Formation Control System to Assist Human Operators in Flying a Swarm of Robotic Blimps
Wu, Tianfu, Fu, Jiaqi, Meng, Wugang, Cho, Sungjin, Zhan, Huanzhe, Zhang, Fumin
Formation control is essential for swarm robotics, enabling coordinated behavior in complex environments. In this paper, we introduce a novel formation control system for an indoor blimp swarm using a specialized leader-follower approach enhanced with a dynamic leader-switching mechanism. This strategy allows any blimp to take on the leader role, distributing maneuvering demands across the swarm and enhancing overall formation stability. Only the leader blimp is manually controlled by a human operator, while follower blimps use onboard monocular cameras and a laser altimeter for relative position and altitude estimation. A leader-switching scheme is proposed to assist the human operator in maintaining the stability of the swarm, especially when a sharp turn is performed. Experimental results confirm that the leader-switching mechanism effectively maintains stable formations and adapts to dynamic indoor environments while assisting the human operator.
Streaming Multi-agent Pathfinding
Tang, Mingkai, Gan, Lu, Zhang, Kaichen
The multi-agent pathfinding (MAPF) problem is to navigate a team of agents from their start points to their goal points. However, this setup is unsuitable for assembly-line scenarios, which are periodic and run for long working hours. To address this issue, the study formalizes the streaming MAPF (S-MAPF) problem, which assumes that the agents in the same agent stream have a periodic start time and share the same action sequence. The proposed solution, Agent Stream Conflict-Based Search (AS-CBS), tackles this problem by incorporating a cyclic vertex/edge constraint to handle conflicts. Additionally, this work explores the potential use of the disjoint splitting strategy within AS-CBS. Experimental results indicate that AS-CBS surpasses traditional MAPF solvers in terms of run-time for scenarios with prolonged working hours.
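To make the idea of a periodic agent stream concrete, here is a small sketch of how a vertex conflict between two streams might be detected. It is an illustrative reading of the abstract above, not the AS-CBS algorithm. It assumes that all streams share one period, that the agent released at time `offset + k * period` occupies `path[i]` at time `offset + k * period + i`, and that agents leave the map after their last step; the function name `cyclic_vertex_conflicts` and its parameters are hypothetical.

```python
from itertools import product


def cyclic_vertex_conflicts(path_a, offset_a, path_b, offset_b, period):
    """List (vertex, phase) pairs where two periodic streams collide.

    Under the stated assumptions, stream A visits path_a[i] at every time
    congruent to offset_a + i modulo the period, and likewise for stream B.
    A vertex conflict therefore exists whenever both streams use the same
    vertex at the same phase of the cycle.
    """
    conflicts = []
    for (i, vertex_a), (j, vertex_b) in product(enumerate(path_a), enumerate(path_b)):
        same_phase = (offset_a + i - offset_b - j) % period == 0
        if vertex_a == vertex_b and same_phase:
            conflicts.append((vertex_a, (offset_a + i) % period))
    return conflicts


if __name__ == "__main__":
    # Both streams cross vertex "B" one step after release.
    print(cyclic_vertex_conflicts(["A", "B", "C"], 0, ["D", "B", "E"], 0, period=2))
    # Shifting the second stream by one time step removes the clash.
    print(cyclic_vertex_conflicts(["A", "B", "C"], 0, ["D", "B", "E"], 1, period=2))
```

Checking phases modulo the period, rather than enumerating individual agents, is what lets a single check cover every agent in a stream regardless of how long the line runs.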