Agent Societies
The Impact of Big Five Personality Traits on AI Agent Decision-Making in Public Spaces: A Social Simulation Study
This study investigates how the Big Five personality traits influence decision-making processes in AI agents within public spaces. Using the AgentVerse framework and GPT-3.5-turbo, we simulated interactions among 10 AI agents, each embodying different dimensions of the Big Five personality traits, in a classroom environment responding to misinformation. The experiment assessed both public expressions ([Speak]) and private thoughts ([Think]) of agents, revealing significant correlations between personality traits and decision-making patterns. Results demonstrate that Openness to Experience had the strongest impact on information acceptance, with curious agents showing high acceptance rates and cautious agents displaying strong skepticism. Extraversion and Conscientiousness also showed notable influence on decision-making, while Neuroticism and Agreeableness exhibited more balanced responses. Additionally, we observed significant discrepancies between public expressions and private thoughts, particularly in agents with friendly and extroverted personalities, suggesting that social context influences decision-making behavior. Our findings contribute to understanding how personality traits shape AI agent behavior in social settings and have implications for developing more nuanced and context-aware AI systems.
Networked Agents in the Dark: Team Value Learning under Partial Observability
Varela, Guilherme S., Sardinha, Alberto, Melo, Francisco S.
We propose a novel cooperative multi-agent reinforcement learning (MARL) approach for networked agents. In contrast to previous methods that rely on complete state information or joint observations, our agents must learn how to reach shared objectives under partial observability. During training, they collect individual rewards and approximate a team value function through local communication, resulting in cooperative behavior. To describe our problem, we introduce the networked dynamic partially observable Markov game framework, where agents communicate over a switching topology communication network. Our distributed method, DNA-MARL, uses a consensus mechanism for local communication and gradient descent for local computation. DNA-MARL increases the range of possible applications of networked agents, being well-suited for real-world domains that impose privacy constraints and where messages may not reach their recipients. We evaluate DNA-MARL across benchmark MARL scenarios. Our results highlight the superior performance of DNA-MARL over previous methods.
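As a rough illustration of the consensus step mentioned in this abstract, the sketch below runs plain average consensus over a communication graph that switches between rounds. It is not DNA-MARL itself: the function names `consensus_round` and `run_consensus`, the Metropolis weighting, and the two example topologies are assumptions chosen for illustration only.

```python
def consensus_round(values, edges):
    """One round of average consensus with Metropolis weights (assumed weighting).

    values: list of floats, one local estimate per agent.
    edges:  set of undirected (i, j) links that are active in this round.
    """
    n = len(values)
    neighbors = {i: set() for i in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    new_values = []
    for i in range(n):
        total = values[i]
        for j in neighbors[i]:
            # Symmetric weight, so the network-wide sum of values is preserved.
            w = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
            total += w * (values[j] - values[i])
        new_values.append(total)
    return new_values


def run_consensus(values, topologies, rounds):
    """Cycle through the given topologies, one per round."""
    for t in range(rounds):
        values = consensus_round(values, topologies[t % len(topologies)])
    return values


# Hypothetical example: four agents, two alternating sparse topologies.
local_estimates = [1.0, 5.0, 3.0, 7.0]
switching = [{(0, 1), (2, 3)}, {(1, 2), (3, 0)}]
final_estimates = run_consensus(local_estimates, switching, rounds=50)
```

No single topology in this example connects all four agents, yet alternating between the two lets every local estimate approach the network-wide mean. That is the property a consensus-based approximation of a team value relies on.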
Ensuring Truthfulness in Distributed Aggregative Optimization
Chen, Ziqin, Egerstedt, Magnus, Wang, Yongqiang
Distributed aggregative optimization methods are gaining increased traction due to their ability to address cooperative control and optimization problems, where the objective function of each agent depends not only on its own decision variable but also on the aggregation of other agents' decision variables. Nevertheless, existing distributed aggregative optimization methods implicitly assume all agents to be truthful in information sharing, which can be unrealistic in real-world scenarios, where agents may act selfishly or strategically. In fact, an opportunistic agent may deceptively share false information in its own favor to minimize its own loss, which, however, will compromise the network-level global performance. To solve this issue, we propose a new distributed aggregative optimization algorithm that can ensure truthfulness of agents and convergence performance. To the best of our knowledge, this is the first algorithm that ensures truthfulness in a fully distributed setting, where no "centralized" aggregator exists to collect private information/decision variables from participating agents. We systematically characterize the convergence rate of our algorithm under nonconvex/convex/strongly convex objective functions, which generalizes existing distributed aggregative optimization results that only focus on convex objective functions. We also rigorously quantify the tradeoff between convergence performance and the level of enabled truthfulness under different convexity conditions. Numerical simulations using distributed charging of electric vehicles confirm the efficacy of our algorithm.
Index Terms: Distributed aggregative optimization, joint differential privacy, truthfulness.
The work was supported in part by the National Science Foundation under Grants ECCS-1912702, CCF-2106293, CCF-2215088, CNS-2219487, and CCF-2334449. Ziqin Chen and Yongqiang Wang are with the Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634 USA, and Magnus Egerstedt is with the Department of Electrical Engineering and Computer Science, University of California, Irvine, Irvine, CA 92697 USA.
Recently, there has been a surge of interest in distributed optimization, which underpins numerous applications in cooperative control [1], [2], signal processing [3], and machine learning [4]. In distributed optimization, a group of agents cooperatively learns a common decision variable that minimizes a global objective function that is the sum of individual agents' objective functions. To solve problem (1), several gradient-tracking-based algorithms have been proposed for strongly convex objective functions [5]-[11] and convex objective functions [12]-[15]. Recently, some results have also been reported for nonconvex objective functions [16], [17].
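For orientation, the two problem classes described in words above are commonly written as follows. The notation ($N$ agents, local objectives $f_i$, local maps $\phi_j$, aggregate $\sigma$) is assumed here and is not taken from this excerpt. In particular, this is not necessarily the paper's own problem (1), which is not reproduced in the text.

```latex
% Standard distributed optimization: all agents agree on one common variable x.
\min_{x \in \mathbb{R}^{d}} \; \sum_{i=1}^{N} f_i(x)

% Distributed aggregative optimization: agent i controls x_i, and its objective
% also depends on an aggregate of all agents' decisions.
\min_{x_1, \dots, x_N} \; \sum_{i=1}^{N} f_i\bigl(x_i, \sigma(x)\bigr),
\qquad
\sigma(x) = \frac{1}{N} \sum_{j=1}^{N} \phi_j(x_j)
```

In the aggregative form no agent can evaluate its own objective without information derived from the others, which is why untruthful sharing can distort the outcome.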
AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making
Huang, Yizhe, Wang, Xingbo, Liu, Hao, Kong, Fanqi, Qin, Aoyang, Tang, Min, Zhu, Song-Chun, Bi, Mingjie, Qi, Siyuan, Feng, Xue
Traditional interactive environments limit agents' intelligence growth with fixed tasks. Recently, single-agent environments address this by generating new tasks based on agent actions, enhancing task diversity. We consider the decision-making problem in multi-agent settings, where tasks are further influenced by social connections, affecting rewards and information access. However, existing multi-agent environments lack a combination of adaptive physical surroundings and social connections, hindering the learning of intelligent behaviors. To address this, we introduce AdaSociety, a customizable multi-agent environment featuring expanding state and action spaces, alongside explicit and alterable social structures. As agents progress, the environment adaptively generates new tasks with social structures for agents to undertake. In AdaSociety, we develop three mini-games showcasing distinct social structures and tasks. Initial results demonstrate that specific social structures can promote both individual and collective benefits, though current reinforcement learning and LLM-based algorithms show limited effectiveness in leveraging social structures to enhance performance. Overall, AdaSociety serves as a valuable research platform for exploring intelligence in diverse physical and social settings.
Dynamic Pricing in High-Speed Railways Using Multi-Agent Reinforcement Learning
Villarrubia-Martin, Enrique Adrian, Rodriguez-Benitez, Luis, Muñoz-Valero, David, Montana, Giovanni, Jimenez-Linares, Luis
This paper addresses a critical challenge in the high-speed passenger railway industry: designing effective dynamic pricing strategies in the context of competing and cooperating operators. To address this, a multi-agent reinforcement learning (MARL) framework based on a non-zero-sum Markov game is proposed, incorporating random utility models to capture passenger decision making. Unlike prior studies in areas such as energy, airlines, and mobile networks, dynamic pricing for railway systems using deep reinforcement learning has received limited attention. A key contribution of this paper is a parametrisable and versatile reinforcement learning simulator designed to model a variety of railway network configurations and demand patterns while enabling realistic, microscopic modelling of user behaviour, called RailPricing-RL. This environment supports the proposed MARL framework, which models heterogeneous agents competing to maximise individual profits while fostering cooperative behaviour to synchronise connecting services. Experimental results validate the framework, demonstrating how user preferences affect MARL performance and how pricing policies influence passenger choices, utility, and overall system dynamics. This study provides a foundation for advancing dynamic pricing strategies in railway systems, aligning profitability with system-wide efficiency, and supporting future research on optimising pricing policies.
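To make the role of a random utility model concrete, the sketch below uses a multinomial logit, one common member of that family. The abstract does not say which random utility model RailPricing-RL uses, so the logit form, the linear utility, the sensitivity parameters, and all numbers are assumptions for illustration.

```python
import math


def logit_choice_probabilities(utilities):
    """Multinomial logit: P(k) = exp(u_k) / sum_j exp(u_j)."""
    highest = max(utilities.values())
    # Subtracting the maximum keeps exp() numerically stable.
    exp_u = {k: math.exp(u - highest) for k, u in utilities.items()}
    normalizer = sum(exp_u.values())
    return {k: v / normalizer for k, v in exp_u.items()}


def utility(price, travel_time_h, price_sensitivity=0.05, time_sensitivity=1.0):
    """Hypothetical linear utility: cheaper and faster services score higher."""
    return -price_sensitivity * price - time_sensitivity * travel_time_h


# Hypothetical offers from two operators plus an outside option.
offers = {
    "operator_a": utility(price=60.0, travel_time_h=2.5),
    "operator_b": utility(price=45.0, travel_time_h=3.0),
    "no_travel": -6.0,
}
probs = logit_choice_probabilities(offers)

# Raising operator_b's price shifts demand toward the alternatives.
pricier = dict(offers, operator_b=utility(price=70.0, travel_time_h=3.0))
probs_after_increase = logit_choice_probabilities(pricier)
```

In a pricing environment of this kind, an agent's action (the price) enters the passengers' utilities, and the resulting choice probabilities determine expected demand and therefore the agent's reward.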
Lifelong Learning of Large Language Model based Agents: A Roadmap
Zheng, Junhao, Shi, Chengming, Cai, Xidi, Li, Qiuke, Zhang, Duzhen, Li, Chenxing, Yu, Dong, Ma, Qianli
Lifelong learning, also known as continual or incremental learning, is a crucial component for advancing Artificial General Intelligence (AGI) by enabling systems to continuously adapt in dynamic environments. While large language models (LLMs) have demonstrated impressive capabilities in natural language processing, existing LLM agents are typically designed for static systems and lack the ability to adapt over time in response to new challenges. This survey is the first to systematically summarize the potential techniques for incorporating lifelong learning into LLM-based agents. We categorize the core components of these agents into three modules: the perception module for multimodal input integration, the memory module for storing and retrieving evolving knowledge, and the action module for grounded interactions with the dynamic environment. We highlight how these pillars collectively enable continuous adaptation, mitigate catastrophic forgetting, and improve long-term performance. This survey provides a roadmap for researchers and practitioners working to develop lifelong learning capabilities in LLM agents, offering insights into emerging trends, evaluation metrics, and application scenarios. Relevant literature and resources are available at https://github.com/qianlima-lab/awesome-lifelong-llm-agent.
Speedup Techniques for Switchable Temporal Plan Graph Optimization
Jiang, He, Lin, Muhan, Li, Jiaoyang
Multi-Agent Path Finding (MAPF) focuses on planning collision-free paths for multiple agents. However, during the execution of a MAPF plan, agents may encounter unexpected delays, which can lead to inefficiencies, deadlocks, or even collisions. To address these issues, the Switchable Temporal Plan Graph provides a framework for finding an acyclic Temporal Plan Graph with the minimum execution cost under delays, ensuring deadlock- and collision-free execution. Unfortunately, existing optimal algorithms, such as Mixed Integer Linear Programming and Graph-Based Switchable Edge Search (GSES), are often too slow for practical use. This paper introduces Improved GSES, which significantly accelerates GSES through four speedup techniques: stronger admissible heuristics, edge grouping, prioritized branching, and incremental implementation. Experiments conducted on four different map types with varying numbers of agents demonstrate that Improved GSES consistently achieves over twice the success rate of GSES and delivers up to a 30-fold speedup on instances where both methods successfully find solutions.
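The idea of choosing edge orientations so that the plan graph stays acyclic and cheap to execute can be sketched on a toy instance. The code below is a simplified illustration, not GSES or Improved GSES: the node names, the durations, and the cost (sum of each agent's finish time) are assumptions, and cycle detection uses an ordinary topological sort (Kahn's algorithm).

```python
from collections import deque


def earliest_finish_times(nodes, edges, duration):
    """Earliest finish time of every node, or None if the graph has a cycle."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    start = {v: 0 for v in nodes}
    finish = {}
    queue = deque(v for v in nodes if indegree[v] == 0)
    while queue:
        u = queue.popleft()
        finish[u] = start[u] + duration[u]
        for v in successors[u]:
            # A node can start only after all its predecessors have finished.
            start[v] = max(start[v], finish[u])
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Nodes on a cycle never reach indegree 0, so they are missing from finish.
    return finish if len(finish) == len(nodes) else None


# Hypothetical plan: agents a and b each perform two steps; a0 is delayed.
nodes = ["a0", "a1", "b0", "b1"]
duration = {"a0": 3, "a1": 1, "b0": 1, "b1": 1}
agent_edges = [("a0", "a1"), ("b0", "b1")]  # fixed order within each agent


def execution_cost(cross_edges):
    """Illustrative cost: sum of both agents' finish times (None if cyclic)."""
    finish = earliest_finish_times(nodes, agent_edges + cross_edges, duration)
    if finish is None:
        return None
    return finish["a1"] + finish["b1"]


cost_a_first = execution_cost([("a1", "b1")])  # b waits for the delayed agent
cost_b_first = execution_cost([("b1", "a1")])  # switched orientation
cost_cyclic = execution_cost([("a1", "b0"), ("b1", "a0")])  # deadlock
```

In this toy instance, letting `b` go first lowers the cost from 9 to 6, while the last orientation creates a cycle and is rejected. A search procedure over switchable edges has to explore such orientation choices, which is where heuristics and branching rules matter.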
Hierarchical Reinforcement Learning for Optimal Agent Grouping in Cooperative Systems
This paper presents a hierarchical reinforcement learning (RL) approach to address the agent grouping or pairing problem in cooperative multi-agent systems. The goal is to simultaneously learn the optimal grouping and agent policy. By employing a hierarchical RL framework, we distinguish between high-level decisions of grouping and low-level agents' actions. Our approach utilizes the CTDE (Centralized Training with Decentralized Execution) paradigm, ensuring efficient learning and scalable execution. We incorporate permutation-invariant neural networks to handle the homogeneity and cooperation among agents, enabling effective coordination. The option-critic algorithm is adapted to manage the hierarchical decision-making process, allowing for dynamic and optimal policy adjustments.
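The permutation-invariance property mentioned above can be shown with a very small example. The abstract does not name the architecture, so the sketch assumes one common construction: a shared per-agent embedding followed by sum pooling. The functions `embed` and `pooled_group_feature`, the weights, and the observations are all hypothetical.

```python
def embed(observation, weights):
    """Shared per-agent embedding: ReLU(W @ observation), in plain Python."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, observation)))
        for row in weights
    ]


def pooled_group_feature(observations, weights):
    """Sum-pool the shared embeddings of all agents in a group.

    Summation ignores the order of its inputs, so relabeling the agents
    leaves the group feature unchanged (up to floating-point rounding).
    """
    pooled = [0.0] * len(weights)
    for observation in observations:
        for k, value in enumerate(embed(observation, weights)):
            pooled[k] += value
    return pooled


# Hypothetical 2x2 embedding weights and three homogeneous agents.
weights = [[0.5, -1.0], [2.0, 1.0]]
team = [[1.0, 2.0], [3.0, 0.0], [0.0, 4.0]]

feature = pooled_group_feature(team, weights)
shuffled_feature = pooled_group_feature(list(reversed(team)), weights)
```

Because the pooled feature does not depend on agent ordering, a high-level grouping policy built on it treats interchangeable agents identically, which avoids having to learn the same behavior once per permutation.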
CORD: Generalizable Cooperation via Role Diversity
Matsuyama, Kanefumi, Su, Kefan, Wang, Jiangxing, Ye, Deheng, Lu, Zongqing
Cooperative multi-agent reinforcement learning (MARL) aims to develop agents that can collaborate effectively. However, most cooperative MARL methods overfit training agents, making learned policies not generalize well to unseen collaborators, which is a critical issue for real-world deployment. Some methods attempt to address the generalization problem but require prior knowledge or predefined policies of new teammates, limiting real-world applications. To this end, we propose a hierarchical MARL approach to enable generalizable cooperation via role diversity, namely CORD. CORD's high-level controller assigns roles to low-level agents by maximizing the role entropy with constraints. We show this constrained objective can be decomposed into causal influence in role that enables reasonable role assignment, and role heterogeneity that yields coherent, non-redundant role clusters. Evaluated on a variety of cooperative multi-agent tasks, CORD achieves better performance than baselines, especially in generalization tests. Ablation studies further demonstrate the efficacy of the constrained objective in generalizable cooperation.
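To give a feel for what "role entropy" measures, the sketch below computes the Shannon entropy of an empirical role assignment. It only illustrates the quantity: CORD's constrained objective and its decomposition are not reproduced here, and the function name `role_entropy` and the example role labels are assumptions.

```python
import math
from collections import Counter


def role_entropy(assignments):
    """Shannon entropy (in nats) of the empirical distribution over roles."""
    counts = Counter(assignments)
    total = len(assignments)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical assignments for a team of four agents.
collapsed = ["attacker", "attacker", "attacker", "attacker"]
skewed = ["attacker", "attacker", "attacker", "defender"]
diverse = ["attacker", "defender", "scout", "healer"]

entropy_collapsed = role_entropy(collapsed)
entropy_skewed = role_entropy(skewed)
entropy_diverse = role_entropy(diverse)
```

Entropy is zero when every agent receives the same role and reaches its maximum, `log(4)` for four roles, when roles are spread evenly. Maximizing it alone would push toward arbitrary diversity, which is presumably why the abstract pairs it with constraints.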
Multi-Agent Collaboration Mechanisms: A Survey of LLMs
Tran, Khanh-Tung, Dao, Dung, Nguyen, Minh-Duong, Pham, Quoc-Viet, O'Sullivan, Barry, Nguyen, Hoang D.
With recent advances in Large Language Models (LLMs), Agentic AI has become phenomenal in real-world applications, moving toward multiple LLM-based agents to perceive, learn, reason, and act collaboratively. These LLM-based Multi-Agent Systems (MASs) enable groups of intelligent agents to coordinate and solve complex tasks collectively at scale, transitioning from isolated models to collaboration-centric approaches. This work provides an extensive survey of the collaborative aspect of MASs and introduces an extensible framework to guide future research. Our framework characterizes collaboration mechanisms based on key dimensions: actors (agents involved), types (e.g., cooperation, competition, or coopetition), structures (e.g., peer-to-peer, centralized, or distributed), strategies (e.g., role-based or model-based), and coordination protocols. Through a review of existing methodologies, our findings serve as a foundation for demystifying and advancing LLM-based MASs toward more intelligent and collaborative solutions for complex, real-world use cases. In addition, various applications of MASs across diverse domains, including 5G/6G networks, Industry 5.0, question answering, and social and cultural settings, are also investigated, demonstrating their wider adoption and broader impacts. Finally, we identify key lessons learned, open challenges, and potential research directions of MASs towards artificial collective intelligence.