Agent Societies
Settling Decentralized Multi-Agent Coordinated Exploration by Novelty Sharing
Jiang, Haobin, Ding, Ziluo, Lu, Zongqing
Exploration in decentralized cooperative multi-agent reinforcement learning faces two challenges. One is that the novelty of global states is unavailable, while the novelty of local observations is biased. The other is how agents can explore in a coordinated way. To address these challenges, we propose MACE, a simple yet effective multi-agent coordinated exploration method. By communicating only local novelty, agents can take into account other agents' local novelty to approximate the global novelty. Further, we introduce weighted mutual information to measure the influence of one agent's action on other agents' accumulated novelty. We convert it into an intrinsic reward in hindsight to encourage agents to exert more influence on other agents' exploration and boost coordinated exploration. Empirically, we show that MACE achieves superior performance in three multi-agent environments with sparse rewards.
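The novelty-sharing idea can be illustrated with a minimal sketch. It is not the MACE implementation: the count-based novelty (1/sqrt(count)), the plain summation of shared values, and all names (`LocalNoveltyTracker`, `approximate_global_novelty`, the observation labels) are assumptions made for illustration. The weighted mutual information reward is not covered here.

```python
from collections import Counter
from math import sqrt


class LocalNoveltyTracker:
    """Count-based novelty over one agent's local observations (assumed form)."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, obs):
        # Update the visit count, then return 1/sqrt(count) as the novelty.
        self.counts[obs] += 1
        return 1.0 / sqrt(self.counts[obs])


def approximate_global_novelty(local_novelties):
    """Combine the communicated local novelties; summation is an assumption."""
    return sum(local_novelties)


trackers = [LocalNoveltyTracker() for _ in range(2)]

# Step 1: both agents see something new.
step1 = [trackers[0].observe("room_a"), trackers[1].observe("room_b")]
# Step 2: agent 0 revisits its observation, agent 1 sees something new.
step2 = [trackers[0].observe("room_a"), trackers[1].observe("room_c")]

shared1 = approximate_global_novelty(step1)
shared2 = approximate_global_novelty(step2)
print(shared1, shared2)
```

In this toy run the shared estimate drops in the second step because one agent revisits a known observation, even though the other agent's local novelty alone would not reveal that.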
Affordable Generative Agents
Yu, Yangbin, Zhang, Qin, Li, Junyou, Fu, Qiang, Ye, Deheng
The emergence of large language models (LLMs) has significantly advanced the simulation of believable interactive agents. However, the substantial cost of maintaining prolonged agent interactions poses a challenge to the deployment of believable LLM-based agents. Therefore, in this paper, we develop Affordable Generative Agents (AGA), a framework for enabling the generation of believable and low-cost interactions at both the agent-environment and inter-agent levels. Specifically, for agent-environment interactions, we substitute repetitive LLM inferences with learned policies, while for inter-agent interactions, we model the social relationships between agents and compress auxiliary dialogue information. Extensive experiments on multiple environments show the effectiveness and efficiency of our proposed framework. We also examine the mechanisms behind emergent believable behaviors in LLM agents, demonstrating that agents can generate only finite behaviors in fixed environments; based on this, we identify ways to facilitate emergent interaction behaviors. Our code is publicly available at: https://github.com/AffordableGenerativeAgents/Affordable-Generative-Agents.
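The idea of substituting repetitive LLM inferences with a reusable policy can be sketched as a lookup that falls back to the model only for unseen situations. This is a simplified illustration, not the AGA code: the exact-match dictionary stands in for whatever learned policy the framework uses, and `CachedPolicy`, `stub_llm`, and the situation strings are hypothetical.

```python
class CachedPolicy:
    """Reuse a stored plan for a known situation; call the model only on a miss."""

    def __init__(self, llm_call):
        self.llm_call = llm_call  # any callable mapping a situation to a plan
        self.policy = {}          # situation -> stored plan
        self.llm_calls = 0        # number of expensive calls actually made

    def act(self, situation):
        if situation not in self.policy:
            self.llm_calls += 1
            self.policy[situation] = self.llm_call(situation)
        return self.policy[situation]


def stub_llm(situation):
    # Hypothetical stand-in for an expensive model inference.
    return f"plan for {situation}"


agent = CachedPolicy(stub_llm)
actions = [agent.act(s) for s in ["wake up", "make coffee", "wake up", "wake up"]]
print(agent.llm_calls, actions)
```

Four decisions here cost only two model calls; the saving grows with how often situations repeat.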
The Power of Populations in Decentralized Bandits
Lazarsfeld, John, Alistarh, Dan
The multi-armed bandit problem, where a single learning agent chooses actions over a sequence of rounds in order to maximize its total reward, is among the most well-studied in online learning. Distributed, multi-agent variants of this problem have also been widely studied under various constraints; one particular such line of work is the cooperative multi-agent bandit setting, where agents are connected over a communication graph and play against a common bandit instance, choosing actions in parallel over T rounds. Each agent locally runs a bandit algorithm that may involve communication with neighbors, and the information exchanged can be used to determine an agent's future actions. This cooperative setting has been studied for both stochastic (Szorenyi et al., 2013; Landgren et al., 2016; Kolla et al., 2018; Martínez-Rubio et al., 2019) and non-stochastic bandits (Awerbuch & Kleinberg, 2008; Cesa-Bianchi et al., 2016; Bar-On & Mansour, 2019), where communication between agents has been shown to improve an agent's regret on average, compared to each agent locally running a centralized bandit algorithm without any communication. However, most prior works in this setting require that every agent communicate with all its neighbors in each round (as pointed out by Cesa-Bianchi et al. (2016), this resembles the LOCAL model of distributed computation (Linial, 1992)). When the underlying graph is dense, this volume of communication may be prohibitively large, which is a known bottleneck in many practical distributed machine learning settings (Alistarh et al., 2017; Koloskova et al., 2019a;b). In contrast, much less is known about cooperative multi-agent bandits in more lightweight decentralized models of distributed communication, such as the GOSSIP model (Boyd et al., 2006; Shah
High-Level, Collaborative Task Planning Grammar and Execution for Heterogeneous Agents
We propose a new multi-agent task grammar to encode collaborative tasks for a team of heterogeneous agents that can have overlapping capabilities. The grammar allows users to specify the relationship between agents and parts of the task without providing explicit assignments or constraints on the number of agents required. We develop a method to automatically find a team of agents and synthesize correct-by-construction control with synchronization policies to satisfy the task. We demonstrate the scalability of our approach through simulation and compare our method to existing task grammars that encode multi-agent tasks.
Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective
Ma, Qun, Xue, Xiao, Zhou, Deyu, Yu, Xiangning, Liu, Donghua, Zhang, Xuwen, Zhao, Zihan, Shen, Yifan, Ji, Peilin, Li, Juanjuan, Wang, Gang, Ma, Wanpeng
Computational experiments have emerged as a valuable method for studying complex systems, involving the algorithmization of counterfactuals. However, accurately representing real social systems in Agent-based Modeling (ABM) is challenging due to the diverse and intricate characteristics of humans, including bounded rationality and heterogeneity. To address this limitation, the integration of Large Language Models (LLMs) has been proposed, enabling agents to possess anthropomorphic abilities such as complex reasoning and autonomous learning. These agents, known as LLM-based Agents, offer the potential to enhance the anthropomorphism lacking in ABM. Nonetheless, the absence of explicit explainability in LLMs significantly hinders their application in the social sciences. Conversely, computational experiments excel in providing causal analysis of individual behaviors and complex phenomena. Thus, combining computational experiments with LLM-based Agents holds substantial research potential. This paper aims to present a comprehensive exploration of this fusion. First, it outlines the historical development of agent structures and their evolution into artificial societies, emphasizing their importance in computational experiments. Then it elucidates the advantages that computational experiments and LLM-based Agents offer each other, considering the perspectives of LLM-based Agents for computational experiments and vice versa. Finally, this paper addresses the challenges and future trends in this research domain, offering guidance for subsequent related studies.
Attention Graph for Multi-Robot Social Navigation with Deep Reinforcement Learning
Escudie, Erwan, Matignon, Laetitia, Saraydaryan, Jacques
Learning robot navigation strategies among pedestrians is crucial for domain-based applications. Combining perception, planning and prediction allows us to model the interactions between robots and pedestrians, resulting in impressive outcomes, especially with recent approaches based on deep reinforcement learning (RL). However, these works do not consider multi-robot scenarios. In this paper, we present MultiSoc, a new method for learning multi-agent socially aware navigation strategies using RL. Inspired by recent works on multi-agent deep RL, our method leverages a graph-based representation of agent interactions, combining the positions and fields of view of entities (pedestrians and agents). Each agent uses a model based on two Graph Neural Networks combined with attention mechanisms. First, an edge-selector produces a sparse graph; then a crowd coordinator applies node attention to produce a graph representing the influence of each entity on the others. This is incorporated into a model-free RL framework to learn multi-agent policies. We evaluate our approach in simulation and provide a series of experiments under various conditions (number of agents / pedestrians). Empirical results show that our method learns faster than mono-agent social navigation deep RL techniques, and enables efficient multi-agent implicit coordination in challenging crowd navigation with multiple heterogeneous humans. Furthermore, by incorporating customizable meta-parameters, we can adjust the neighborhood density taken into account in our navigation strategy.
Traffic Flow Optimisation for Lifelong Multi-Agent Path Finding
Chen, Zhe, Harabor, Daniel, Li, Jiaoyang, Stuckey, Peter J.
Multi-Agent Path Finding (MAPF) is a fundamental problem in robotics that asks us to compute collision-free paths for a team of agents, all moving across a shared map. Although many works address this topic, all current algorithms struggle as the number of agents grows. The principal reason is that existing approaches typically plan free-flow optimal paths, which creates congestion. To tackle this issue, we propose a new approach for MAPF where agents are guided to their destination by following congestion-avoiding paths. We evaluate the idea in two large-scale settings: one-shot MAPF, where each agent has a single destination, and lifelong MAPF, where agents are continuously assigned new destinations. Empirically, we report large improvements in solution quality for one-shot MAPF and in overall throughput for lifelong MAPF.
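The contrast between free-flow optimal paths and congestion-avoiding paths can be illustrated with a small sketch. It is not the authors' algorithm: it plans agents one at a time with Dijkstra's algorithm on an open grid, charges an extra cost for entering cells that earlier plans already use, and ignores collisions entirely. The cost form `1 + penalty * traffic`, the penalty value, and the function name `plan` are assumptions.

```python
import heapq
from collections import Counter


def plan(start, goal, rows, cols, traffic, penalty):
    """Dijkstra on an open grid; entering a cell costs 1 + penalty * traffic[cell]."""
    frontier = [(0.0, start)]
    best = {start: 0.0}
    parent = {start: None}
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            new_cost = cost + 1.0 + penalty * traffic[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost, nxt))
    # Walk back from the goal (always reachable on an obstacle-free grid).
    path, cell = [], goal
    while cell is not None:
        path.append(cell)
        cell = parent[cell]
    return path[::-1]


traffic = Counter()  # cell -> number of planned paths using it
path_a = plan((0, 0), (0, 2), 3, 3, traffic, penalty=3.0)
traffic.update(path_a)
path_b = plan((0, 0), (0, 2), 3, 3, traffic, penalty=3.0)
print(path_a)
print(path_b)
```

The first agent takes the direct route along the top row. The second agent, seeing that row as congested, accepts a longer route through the middle row instead of following the same free-flow path.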
Scalable Mechanism Design for Multi-Agent Path Finding
Friedrich, Paul, Zhang, Yulun, Curry, Michael, Dierks, Ludwig, McAleer, Stephen, Li, Jiaoyang, Sandholm, Tuomas, Seuken, Sven
Multi-Agent Path Finding (MAPF) involves determining paths for multiple agents to travel simultaneously through a shared area toward particular goal locations. This problem is computationally complex, especially when dealing with large numbers of agents, as is common in realistic applications like autonomous vehicle coordination. Finding an optimal solution is often computationally infeasible, making the use of approximate algorithms essential. Adding to the complexity, agents might act in a self-interested and strategic way, possibly misrepresenting their goals to the MAPF algorithm if it benefits them. Although the field of mechanism design offers tools to align incentives, using these tools without careful consideration can fail when only having access to approximately optimal outcomes. Since approximations are crucial for scalable MAPF algorithms, this poses a significant challenge. In this work, we introduce the problem of scalable mechanism design for MAPF and propose three strategyproof mechanisms, two of which even use approximate MAPF algorithms. We test our mechanisms on realistic MAPF domains with problem sizes ranging from dozens to hundreds of agents. Our findings indicate that they improve welfare beyond a simple baseline.
ReLoki: Infrastructure-free Distributed Relative Localization using On-board UWB Antenna Arrays
Mathew, Joseph Prince, Nowzari, Cameron
Coordination of multi-robot systems requires some form of localization between agents, but most methods today rely on some external infrastructure. Ultra Wide Band (UWB) sensing has gained popularity in relative localization applications, and we see many implementations that use cooperative agents augmenting UWB range measurements with other sensing modalities (e.g., ViO, IMU, VSLAM) for infrastructure-free relative localization. A less-researched option is using Angle of Arrival (AoA) readings obtained from UWB antenna pairs to perform relative localization. In this paper we present a UWB platform called ReLoki that can be used for ranging and AoA-based relative localization in 3D. ReLoki enables any message sent from a transmitting agent to be localized by using a Regular Tetrahedral Antenna Array (RTA). As a full-scale proof of concept, we deploy ReLoki on a 3-robot system and compare its performance in terms of accuracy and speed with prior methods.
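For readers unfamiliar with AoA from an antenna pair, the textbook narrowband far-field relation is that a plane wave arriving at angle θ from broadside produces a phase difference Δφ = 2π d sin(θ) / λ between two antennas spaced d apart. The sketch below inverts that relation. It is a generic illustration and not ReLoki's processing pipeline; the 6.5 GHz carrier, the half-wavelength spacing, and the function names are assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def phase_difference(angle_rad, spacing_m, wavelength_m):
    """Forward model: phase lag between two antennas for a far-field plane wave."""
    return 2.0 * math.pi * spacing_m * math.sin(angle_rad) / wavelength_m


def angle_of_arrival(delta_phi_rad, spacing_m, wavelength_m):
    """Invert the forward model; unambiguous when spacing <= wavelength / 2."""
    s = delta_phi_rad * wavelength_m / (2.0 * math.pi * spacing_m)
    s = max(-1.0, min(1.0, s))  # clamp against measurement noise
    return math.asin(s)


wavelength = SPEED_OF_LIGHT / 6.5e9  # assumed 6.5 GHz carrier, illustrative only
spacing = wavelength / 2.0           # assumed half-wavelength antenna spacing
true_angle = math.radians(30.0)

measured = phase_difference(true_angle, spacing, wavelength)
estimate_deg = math.degrees(angle_of_arrival(measured, spacing, wavelength))
print(estimate_deg)
```

A single pair resolves only one angle relative to its own axis, which is one reason a 3D estimate needs several non-collinear pairs.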
Norm Enforcement with a Soft Touch: Faster Emergence, Happier Agents
Tzeng, Sz-Ting, Ajmeri, Nirav, Singh, Munindar P.
A multiagent system can be viewed as a society of autonomous agents, whose interactions can be effectively regulated via social norms. In general, the norms of a society are not hardcoded but emerge from the agents' interactions. Specifically, how the agents in a society react to each other's behavior and respond to the reactions of others determines which norms emerge in the society. We think of these reactions by an agent to the satisfactory or unsatisfactory behaviors of another agent as communications from the first agent to the second agent. Understanding these communications is a kind of social intelligence: these communications provide natural drivers for norm emergence by pushing agents toward certain behaviors, which can become established as norms. Whereas it is well-known that sanctioning can lead to the emergence of norms, we posit that a broader kind of social intelligence can prove more effective in promoting cooperation in a multiagent system. Accordingly, we develop Nest, a framework that models social intelligence in the form of a wider variety of communications and understanding of them than in previous work. To evaluate Nest, we develop a simulated pandemic environment and conduct simulation experiments to compare Nest with baselines considering a combination of three kinds of social communication: sanction, tell, and hint. We find that societies formed of Nest agents achieve norms faster; moreover, Nest agents effectively avoid undesirable consequences, which are negative sanctions and deviation from goals, and yield higher satisfaction for themselves than baseline agents despite requiring only an equivalent amount of information.