Agent Societies
Towards an LLM-powered Social Digital Twinning Platform
Gürcan, Önder, Falck, Vanja, Rousseau, Markus G., Lima, Larissa L.
We present Social Digital Twinner, an innovative social simulation tool for exploring plausible effects of what-if scenarios in complex adaptive social systems. The architecture is composed of three seamlessly integrated parts: a data infrastructure featuring real-world data and a multi-dimensionally representative synthetic population of citizens, an LLM-enabled agent-based simulation engine, and a user interface that enables intuitive, natural language interactions with the simulation engine and the artificial agents (i.e., citizens). Social Digital Twinner facilitates real-time engagement and empowers stakeholders to collaboratively design, test, and refine intervention measures, promoting a data-driven and evidence-based approach to societal problem-solving. We demonstrate the tool's interactive capabilities by addressing the critical issue of youth school dropouts in Kragero, Norway, showcasing its ability to create and execute a dedicated social digital twin using natural language.
Explaining Strategic Decisions in Multi-Agent Reinforcement Learning for Aerial Combat Tactics
Selmonaj, Ardian, Antonucci, Alessandro, Schneider, Adrian, Rüegsegger, Michael, Sommer, Matthias
Artificial intelligence (AI) is reshaping strategic planning, with Multi-Agent Reinforcement Learning (MARL) enabling coordination among autonomous agents in complex scenarios. However, its practical deployment in sensitive military contexts is constrained by the lack of explainability, which is an essential factor for trust, safety, and alignment with human strategies. This work reviews and assesses current advances in explainability methods for MARL with a focus on simulated air combat scenarios. We proceed by adapting various explainability techniques to different aerial combat scenarios to gain explanatory insights about the model behavior. By linking AI-generated tactics with human-understandable reasoning, we emphasize the need for transparency to ensure reliable deployment and meaningful human-machine interaction. By illuminating the crucial importance of explainability in advancing MARL for operational defense, our work supports not only strategic planning but also the training of military personnel with insightful and comprehensible analyses.
Fixing Incomplete Value Function Decomposition for Multi-Agent Reinforcement Learning
Baisero, Andrea, Bhati, Rupali, Liu, Shuo, Pillai, Aathira, Amato, Christopher
Value function decomposition methods for cooperative multi-agent reinforcement learning compose joint values from individual per-agent utilities, and train them using a joint objective. To ensure that the action selection process between individual utilities and joint values remains consistent, it is imperative for the composition to satisfy the individual-global max (IGM) property. Although satisfying IGM itself is straightforward, most existing methods (e.g., VDN, QMIX) have limited representation capabilities and are unable to represent the full class of IGM values, and the one exception that has no such limitation (QPLEX) is unnecessarily complex. In this work, we present a simple formulation of the full class of IGM values that naturally leads to the derivation of QFIX, a novel family of value function decomposition models that expand the representation capabilities of prior models by means of a thin "fixing" layer. We derive multiple variants of QFIX, and implement three variants in two well-known multi-agent frameworks. We perform an empirical evaluation on multiple SMACv2 and Overcooked environments, which confirms that QFIX (i) succeeds in enhancing the performance of prior methods, (ii) learns more stably and performs better than its main competitor QPLEX, and (iii) achieves this while employing the simplest and smallest mixing models.
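The IGM property mentioned above can be illustrated on a toy problem. The sketch below is not the QFIX model; it only shows a VDN-style additive composition and a brute-force check that the greedy joint action matches the tuple of per-agent greedy actions. All function names and the example utilities are hypothetical, and ties between actions are ignored for simplicity.

```python
from itertools import product

def vdn_joint(utilities, joint_action):
    """VDN-style composition: joint value = sum of per-agent utilities."""
    return sum(u[a] for u, a in zip(utilities, joint_action))

def greedy_individual(utilities):
    """Each agent independently picks the action maximizing its own utility."""
    return tuple(max(range(len(u)), key=u.__getitem__) for u in utilities)

def greedy_joint(joint_value, utilities):
    """Brute-force maximization of the joint value over all joint actions."""
    action_sets = [range(len(u)) for u in utilities]
    return max(product(*action_sets), key=lambda ja: joint_value(utilities, ja))

def satisfies_igm(joint_value, utilities):
    """Simplified IGM check (assumes unique maximizers, i.e. no ties)."""
    return greedy_joint(joint_value, utilities) == greedy_individual(utilities)

def mismatched_joint(utilities, joint_action):
    """A joint value that ignores the utilities, so IGM is violated here."""
    return 1.0 if joint_action == (0, 1) else 0.0

# Hypothetical utilities: agent 0 has three actions, agent 1 has two.
utilities = [[0.1, 0.9, 0.3], [0.5, 0.2]]
igm_holds_for_vdn = satisfies_igm(vdn_joint, utilities)
igm_holds_for_mismatch = satisfies_igm(mismatched_joint, utilities)
```

The additive composition passes the check because a sum is monotonic in each term, while the mismatched table fails it. The abstract's point is that such simple compositions satisfy IGM but cannot represent every joint value that does.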
Enhancing Aerial Combat Tactics through Hierarchical Multi-Agent Reinforcement Learning
Selmonaj, Ardian, Szehr, Oleg, Del Rio, Giacomo, Antonucci, Alessandro, Schneider, Adrian, Rüegsegger, Michael
This is motivated by the strong performance of RL agents in finding effective Courses of Action (CoA) across a wide range of environments, including combinatorial settings such as Chess or Go [1], real-time continuous control tasks found in arcade video games [2], and scenarios that combine control with strategic decision-making, as seen in modern wargames [3]. The application of RL in the context of air combat comes with a number of specific challenges. These include structural properties of the simulation scenario, such as the complexity of the individual units and their flight dynamics, the exponential size of the combined state and action spaces, the depth of the planning horizon, and the presence of stochasticity and imperfect information. Overall, the size of the game tree (i.e., the set of possible CoAs) in strategic games and defense scenarios appears vast and beyond the reach of straightforward search. Furthermore, real-world operations involve not only the simultaneous maneuvering of individual units but also attention to strategic positions and global mission planning. Training policies that integrate real-time control at the troop level with high-level mission planning at the commander level is challenging, as these tasks inherently demand distinct system requirements, algorithmic approaches, and training configurations.
Conceptual Logical Foundations of Artificial Social Intelligence
What makes a society possible at all? How are coordination and cooperation in social activity possible? What is the minimal mental architecture of a social agent? How is the information about the state of the world related to the agents' intentions? How are the intentions of agents related? What role does communication play in this coordination process? This essay explores the conceptual and logical foundations of artificial social intelligence in the context of a society of multiple agents that communicate and cooperate to achieve some end. An attempt is made to provide an introduction to some of the key concepts, their formal definitions and their interrelationships. These include the notion of a changing social world of multiple agents. The logic of social intelligence goes beyond classical logic by linking information with strategic thought. A minimal architecture of social agents is presented. The agents have different, dynamically changing possible choices and abilities. The agents also have uncertainty, lacking perfect information about their physical state as well as their dynamic social state. The social state of an agent includes the intentional state of that agent, as well as that agent's representation of the intentional states of other agents. Furthermore, it includes the evaluations agents make of their physical and social condition. Communication, semantic and pragmatic meaning, and their relationship to intention and information states are investigated. The logic of agent abilities and intentions is motivated and formalized. The entropy of group strategic states is defined.
Credit Assignment and Efficient Exploration based on Influence Scope in Multi-agent Reinforcement Learning
Han, Shuai, Dastani, Mehdi, Wang, Shihan
Training cooperative agents in sparse-reward scenarios poses significant challenges for multi-agent reinforcement learning (MARL). Without clear feedback on actions at each step in sparse-reward settings, previous methods struggle with precise credit assignment among agents and effective exploration. In this paper, we introduce a novel method to deal with both credit assignment and exploration problems in reward-sparse domains. Accordingly, we propose an algorithm that calculates the Influence Scope of Agents (ISA) on states by taking the specific values of the state dimensions/attributes that individual agents can influence. The mutual dependence between agents' actions and state attributes is then used to calculate the credit assignment and to delimit the exploration space for each individual agent. We then evaluate ISA in a variety of sparse-reward multi-agent scenarios. The results show that our method significantly outperforms the state-of-the-art baselines.
MC-Swarm: Minimal-Communication Multi-Agent Trajectory Planning and Deadlock Resolution for Quadrotor Swarm
For effective multi-agent trajectory planning, it is important to consider lightweight communication and its potential asynchrony. This paper presents a distributed trajectory planning algorithm for a quadrotor swarm that operates asynchronously and requires no communication except during the initial planning phase. To effectively ensure these points, we build two main modules: coordination state updater and trajectory optimizer. The coordination state updater computes waypoints for each agent toward its goal and performs subgoal optimization while considering deadlocks, as well as safety constraints with respect to neighbor agents and obstacles. Then, the trajectory optimizer generates a trajectory that ensures collision avoidance even with the asynchronous planning updates of neighboring agents. We provide a theoretical guarantee of collision avoidance with deadlock resolution and evaluate the effectiveness of our method in complex simulation environments, including random forests and narrow-gap mazes. Additionally, to reduce the total mission time, we design a faster coordination state update using lightweight communication. Lastly, our approach is validated through extensive simulations and real-world experiments with cluttered environment scenarios.
Index Terms: Path Planning for Multiple Mobile Robots, Collision Avoidance, Distributed Robot Systems.
The compactness of quadrotor drones enables the operation of multi-agent systems in cluttered environments. While small teams of drones can be manually controlled by human pilots, large-scale swarms require autonomous coordination, where multi-agent trajectory planning (MATP) serves as a critical component. Over the past decade, MATP has been extensively studied, leading to its adoption in various applications, such as surveillance [1], inspection [2], and transportation [3].
Many existing MATP frameworks rely on synchronous coordination, where agents repeatedly exchange information to maintain consistency during planning and execution [4]. However, as the number of agents increases, the communication load grows significantly, often resulting in message delays and packet losses.
The author is with AI Institute of Seoul National University, Seoul, South Korea, and Carnegie Mellon University, Pittsburgh, PA, USA (e-mail: yunwoo333@gmail.com)
The author is with the Department of Mechanical System Design Engineering, Seoul National University of Science and Technology (SEOULTECH), Seoul, South Korea (e-mail: jungwonpark@seoultech.ac.kr)
Agent-as-a-Service based on Agent Network
Zhu, Yuhan, Liu, Haojie, Wang, Jian, Li, Bing, Yin, Zikang, Liao, Yefei
The rise of large model-based AI agents has spurred interest in Multi-Agent Systems (MAS) for their capabilities in decision-making, collaboration, and adaptability. While the Model Context Protocol (MCP) addresses tool invocation and data exchange challenges via a unified protocol, it lacks support for organizing agent-level collaboration. To bridge this gap, we propose Agent-as-a-Service based on Agent Network (AaaS-AN), a service-oriented paradigm grounded in the Role-Goal-Process-Service (RGPS) standard. AaaS-AN unifies the entire agent lifecycle, including construction, integration, interoperability, and networked collaboration, through two core components: (1) a dynamic Agent Network, which models agents and agent groups as vertices that self-organize within the network based on task and role dependencies; (2) service-oriented agents, incorporating service discovery, registration, and interoperability protocols. These are orchestrated by a Service Scheduler, which leverages an Execution Graph to enable distributed coordination, context tracking, and runtime task management. We validate AaaS-AN on mathematical reasoning and application-level code generation tasks, where it outperforms state-of-the-art baselines. Notably, we constructed a MAS based on AaaS-AN containing agent groups, Robotic Process Automation (RPA) workflows, and MCP servers, with over 100 agent services. We also release a dataset containing 10,000 long-horizon multi-agent workflows to facilitate future research on long-chain collaboration in MAS.
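As a rough illustration of how service registration, discovery, and a dependency-ordered execution graph could fit together, consider the minimal sketch below. It is not the AaaS-AN implementation: `ServiceRegistry`, `run_execution_graph`, and the three toy services are hypothetical names chosen for this example, and the scheduling is a plain topological sort from the Python standard library.

```python
from graphlib import TopologicalSorter

class ServiceRegistry:
    """Hypothetical registry mapping service names to callables."""

    def __init__(self):
        self._services = {}

    def register(self, name, handler):
        self._services[name] = handler

    def discover(self, name):
        return self._services[name]

def run_execution_graph(registry, graph, context):
    """Run services in dependency order.

    graph maps each service name to the set of services it depends on.
    Each service reads the shared context and stores its result under its name.
    """
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        context[name] = registry.discover(name)(context)
    return order, context

# Toy services standing in for agents (assumptions for illustration only).
registry = ServiceRegistry()
registry.register("parse", lambda ctx: ctx["input"].split())
registry.register("count", lambda ctx: len(ctx["parse"]))
registry.register("report", lambda ctx: f"{ctx['count']} tokens")

graph = {"parse": set(), "count": {"parse"}, "report": {"count"}}
order, result = run_execution_graph(registry, graph, {"input": "plan the long task"})
```

Keeping intermediate results in a shared context dictionary is one simple way to approximate the context tracking the abstract mentions. A distributed scheduler would need considerably more machinery than this.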
PRISM: Complete Online Decentralized Multi-Agent Pathfinding with Rapid Information Sharing using Motion Constraints
Lee, Hannah, Serlin, Zachary, Motes, James, Long, Brendan, Morales, Marco, Amato, Nancy M.
We introduce PRISM (Pathfinding with Rapid Information Sharing using Motion Constraints), a decentralized algorithm designed to address the multi-task multi-agent pathfinding (MT-MAPF) problem. PRISM enables large teams of agents to concurrently plan safe and efficient paths for multiple tasks while avoiding collisions. It employs a rapid communication strategy that uses information packets to exchange motion constraint information, enhancing cooperative pathfinding and situational awareness, even in scenarios without direct communication. We prove that PRISM resolves and avoids all deadlock scenarios when possible, a critical challenge in decentralized pathfinding. Empirically, we evaluate PRISM across five environments and 25 random scenarios, benchmarking it against the centralized Conflict-Based Search (CBS) and the decentralized Token Passing with Task Swaps (TPTS) algorithms. PRISM demonstrates scalability and solution quality, supporting 3.4 times more agents than CBS and handling up to 2.5 times more tasks in narrow passage environments than TPTS. Additionally, PRISM matches CBS in solution quality while achieving faster computation times, even under low-connectivity conditions. Its decentralized design reduces the computational burden on individual agents, making it scalable for large environments. These results confirm PRISM's robustness, scalability, and effectiveness in complex and dynamic pathfinding scenarios.
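To make the notion of collision avoidance concrete, the sketch below detects the two conflict types commonly used in the multi-agent pathfinding literature: vertex conflicts (two agents in the same cell at the same timestep) and edge conflicts (two agents swapping cells). It is not PRISM's algorithm; the function name, the grid-cell path representation, and the assumption that agents wait at their final cell are all illustrative choices.

```python
def find_conflicts(paths):
    """Return vertex and edge (swap) conflicts between timed paths.

    paths: dict mapping agent id -> list of cells, one cell per timestep.
    Agents are assumed to wait at their last cell once their path ends.
    """
    horizon = max(len(p) for p in paths.values())

    def at(path, t):
        # Clamp the index so finished agents stay at their goal.
        return path[min(t, len(path) - 1)]

    conflicts = []
    agents = sorted(paths)
    for t in range(horizon):
        for i, a in enumerate(agents):
            for b in agents[i + 1:]:
                pa, pb = paths[a], paths[b]
                if at(pa, t) == at(pb, t):
                    conflicts.append(("vertex", a, b, t))
                elif (t + 1 < horizon
                      and at(pa, t) == at(pb, t + 1)
                      and at(pa, t + 1) == at(pb, t)):
                    conflicts.append(("edge", a, b, t))
    return conflicts

# Two agents crossing a corridor meet in the middle cell at t = 1.
crossing = {"a": [(0, 0), (0, 1), (0, 2)], "b": [(0, 2), (0, 1), (0, 0)]}
# Two adjacent agents try to swap places between t = 0 and t = 1.
swapping = {"a": [(0, 0), (0, 1)], "b": [(0, 1), (0, 0)]}

crossing_conflicts = find_conflicts(crossing)
swapping_conflicts = find_conflicts(swapping)
```

A planner would use such a check either to validate a candidate set of paths or to derive the constraints that each agent must respect when replanning.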
EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation
Mou, Xinyi, Qian, Chen, Liu, Wei, Huang, Xuanjing, Wei, Zhongyu
Large language models (LLMs) have demonstrated an impressive ability to role-play humans and replicate complex social dynamics. While large-scale social simulations are gaining increasing attention, they still face significant challenges, particularly regarding high time and computation costs. Existing solutions, such as distributed mechanisms or hybrid agent-based model (ABM) integrations, either fail to address inference costs or compromise accuracy and generalizability. To this end, we propose EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation. EcoLANG operates in two stages: (1) language evolution, where we filter synonymous words and optimize sentence-level rules through natural selection, and (2) language utilization, where agents in social simulations communicate using the evolved language. Experimental results demonstrate that EcoLANG reduces token consumption by over 20%, enhancing efficiency without sacrificing simulation accuracy.