Goto

Collaborating Authors

 Agent Societies


Design and Analysis of an Extreme-Scale, High-Performance, and Modular Agent-Based Simulation Platform

arXiv.org Artificial Intelligence

Agent-based modeling is indispensable for studying complex systems across many domains. However, existing simulation platforms exhibit two major issues: performance and modularity. Low performance prevents simulations with a large number of agents, increases development time, limits parameter exploration, and raises computing costs. Inflexible software designs motivate modelers to create their own tools, diverting valuable resources. This dissertation introduces a novel simulation platform called BioDynaMo and its significant improvement, TeraAgent, to alleviate these challenges via three major works. First, we lay the platform's foundation by defining abstractions, establishing software infrastructure, and implementing a multitude of features for agent-based modeling. We demonstrate BioDynaMo's modularity through use cases in neuroscience, epidemiology, and oncology. We validate these models and show the simplicity of adding new functionality with few lines of code. Second, we perform a rigorous performance analysis and identify challenges for shared-memory parallelism. Provided solutions include an optimized grid for neighbor searching, mechanisms to reduce the memory access latency, and exploiting domain knowledge to omit unnecessary work. These improvements yield up to three orders of magnitude speedups, enabling simulations of 1.7 billion agents on a single server. Third, we present TeraAgent, a distributed simulation engine that allows scaling out the computation of one simulation to multiple servers. We identify and address server communication bottlenecks and implement solutions for serialization and delta encoding to accelerate and reduce data transfer. TeraAgent can simulate 500 billion agents and scales to 84096 CPU cores. BioDynaMo has been widely adopted, including a prize-winning radiotherapy simulation recognized as a top 10 breakthrough in physics in 2024.


SySLLM: Generating Synthesized Policy Summaries for Reinforcement Learning Agents Using Large Language Models

arXiv.org Artificial Intelligence

Policies generated by Reinforcement Learning (RL) algorithms can be difficult to describe to users, as they result from the interplay between complex reward structures and neural network-based representations. This combination often leads to unpredictable behaviors, making policies challenging to analyze and posing significant obstacles to fostering human trust in real-world applications. Global policy summarization methods aim to describe agent behavior through a demonstration of actions in a subset of world-states. However, users can only watch a limited number of demonstrations, restricting their understanding of policies. Moreover, those methods overly rely on user interpretation, as they do not synthesize observations into coherent patterns. In this work, we present SySLLM (Synthesized Summary using LLMs), a novel method that employs synthesis summarization, utilizing large language models' (LLMs) extensive world knowledge and ability to capture patterns, to generate textual summaries of policies. Specifically, an expert evaluation demonstrates that the proposed approach generates summaries that capture the main insights generated by experts while not resulting in significant hallucinations. Additionally, a user study shows that SySLLM summaries are preferred over demonstration-based policy summaries and match or surpass their performance in objective agent identification tasks.


Multi-Agent Q-Learning Dynamics in Random Networks: Convergence due to Exploration and Sparsity

arXiv.org Artificial Intelligence

Beyond specific settings, many multi-agent learning algorithms fail to converge to an equilibrium solution, and instead display complex, non-stationary behaviours such as recurrent or chaotic orbits. In fact, recent literature suggests that such complex behaviours are likely to occur when the number of agents increases. In this paper, we study Q-learning dynamics in network polymatrix games where the network structure is drawn from classical random graph models. In particular, we focus on the Erdos-Renyi model, a well-studied model for social networks, and the Stochastic Block model, which generalizes the above by accounting for community structures within the network. In each setting, we establish sufficient conditions under which the agents' joint strategies converge to a unique equilibrium. We investigate how this condition depends on the exploration rates, payoff matrices and, crucially, the sparsity of the network. Finally, we validate our theoretical findings through numerical simulations and demonstrate that convergence can be reliably achieved in many-agent systems, provided network sparsity is controlled.


PairVDN - Pair-wise Decomposed Value Functions

arXiv.org Artificial Intelligence

Extending deep Q-learning to cooperative multi-agent settings is challenging due to the exponential growth of the joint action space, the non-stationary environment, and the credit assignment problem. Value decomposition allows deep Q-learning to be applied at the joint agent level, at the cost of reduced expressivity. Building on past work in this direction, our paper proposes PairVDN, a novel method for decomposing the value function into a collection of pair-wise, rather than per-agent, functions, improving expressivity at the cost of requiring a more complex (but still efficient) dynamic programming maximisation algorithm. Our method enables the representation of value functions which cannot be expressed as a monotonic combination of per-agent functions, unlike past approaches such as VDN and QMIX. We implement a novel many-agent cooperative environment, Box Jump, and demonstrate improved performance over these baselines in this setting. We open-source our code and environment at https://github.com/zzbuzzard/PairVDN.


Networked Communication for Decentralised Cooperative Agents in Mean-Field Control

arXiv.org Artificial Intelligence

We introduce networked communication to mean-field control (MFC) - the cooperative counterpart to mean-field games (MFGs) - and in particular to the setting where decentralised agents learn online from a single, non-episodic run of the empirical system. We adapt recent algorithms for MFGs to this new setting, as well as contributing a novel sub-routine allowing networked agents to estimate the global average reward from their local neighbourhood. We show that the networked communication scheme allows agents to increase social welfare faster than under both the centralised and independent architectures, by computing a population of potential updates in parallel and then propagating the highest-performing ones through the population, via a method that can also be seen as tackling the credit-assignment problem. We prove this new result theoretically and provide experiments that support it across numerous games, as well as exploring the empirical finding that smaller communication radii can benefit convergence in a specific class of game while still outperforming agents learning entirely independently. We provide numerous ablation studies and additional experiments on numbers of communication round and robustness to communication failures.


COLA: A Scalable Multi-Agent Framework For Windows UI Task Automation

arXiv.org Artificial Intelligence

With the rapid advancements in Large Language Models (LLMs), an increasing number of studies have leveraged LLMs as the cognitive core of agents to address complex task decision-making challenges. Specially, recent research has demonstrated the potential of LLM-based agents on automating Windows GUI operations. However, existing methodologies exhibit two critical challenges: (1) static agent architectures fail to dynamically adapt to the heterogeneous requirements of OS-level tasks, leading to inadequate scenario generalization;(2) the agent workflows lack fault tolerance mechanism, necessitating complete process re-execution for UI agent decision error. To address these limitations, we introduce \textit{COLA}, a collaborative multi-agent framework for automating Windows UI operations. In this framework, a scenario-aware agent Task Scheduler decomposes task requirements into atomic capability units, dynamically selects the optimal agent from a decision agent pool, effectively responds to the capability requirements of diverse scenarios. The decision agent pool supports plug-and-play expansion for enhanced flexibility. In addition, we design a memory unit equipped to all agents for their self-evolution. Furthermore, we develop an interactive backtracking mechanism that enables human to intervene to trigger state rollbacks for non-destructive process repair. Our experimental results on the GAIA benchmark demonstrates that the \textit{COLA} framework achieves state-of-the-art performance with an average score of 31.89\%, significantly outperforming baseline approaches without web API integration. Ablation studies further validate the individual contributions of our dynamic scheduling. The code is available at https://github.com/Alokia/COLA-demo.


Steering No-Regret Agents in MFGs under Model Uncertainty

arXiv.org Machine Learning

Incentive design is a popular framework for guiding agents' learning dynamics towards desired outcomes by providing additional payments beyond intrinsic rewards. However, most existing works focus on a finite, small set of agents or assume complete knowledge of the game, limiting their applicability to real-world scenarios involving large populations and model uncertainty. To address this gap, we study the design of steering rewards in Mean-Field Games (MFGs) with density-independent transitions, where both the transition dynamics and intrinsic reward functions are unknown. This setting presents non-trivial challenges, as the mediator must incentivize the agents to explore for its model learning under uncertainty, while simultaneously steer them to converge to desired behaviors without incurring excessive incentive payments. Assuming agents exhibit no(-adaptive) regret behaviors, we contribute novel optimistic exploration algorithms. Theoretically, we establish sub-linear regret guarantees for the cumulative gaps between the agents' behaviors and the desired ones. In terms of the steering cost, we demonstrate that our total incentive payments incur only sub-linear excess, competing with a baseline steering strategy that stabilizes the target policy as an equilibrium. Our work presents an effective framework for steering agents behaviors in large-population systems under uncertainty.


InfluenceNet: AI Models for Banzhaf and Shapley Value Prediction

arXiv.org Artificial Intelligence

Power indices are essential in assessing the contribution and influence of individual agents in multi-agent systems, providing crucial insights into collaborative dynamics and decision-making processes. While invaluable, traditional computational methods for exact or estimated power indices values require significant time and computational constraints, especially for large $(n\ge10)$ coalitions. These constraints have historically limited researchers' ability to analyse complex multi-agent interactions comprehensively. To address this limitation, we introduce a novel Neural Networks-based approach that efficiently estimates power indices for voting games, demonstrating comparable and often superiour performance to existing tools in terms of both speed and accuracy. This method not only addresses existing computational bottlenecks, but also enables rapid analysis of large coalitions, opening new avenues for multi-agent system research by overcoming previous computational limitations and providing researchers with a more accessible, scalable analytical tool.This increased efficiency will allow for the analysis of more complex and realistic multi-agent scenarios.


A Cascading Cooperative Multi-agent Framework for On-ramp Merging Control Integrating Large Language Models

arXiv.org Artificial Intelligence

Traditional Reinforcement Learning (RL) suffers from replicating human-like behaviors, generalizing effectively in multi-agent scenarios, and overcoming inherent interpretability issues.These tasks are compounded when deep environment understanding, agent coordination and dynamic optimization are required. While Large Language Model (LLM) enhanced methods have shown promise in generalization and interoperability, they often neglect necessary multi-agent coordination. Therefore, we introduce the Cascading Cooperative Multi-agent (CCMA) framework, integrating RL for individual interactions, a fine-tuned LLM for regional cooperation, a reward function for global optimization, and the Retrieval-augmented Generation mechanism to dynamically optimize decision-making across complex driving scenarios. Our experiments demonstrate that the CCMA outperforms existing RL methods, demonstrating significant improvements in both micro and macro-level performance in complex driving environments.


Fully-Decentralized MADDPG with Networked Agents

arXiv.org Artificial Intelligence

In this paper, we devise three actor-critic algorithms with decentralized training for multi-agent reinforcement learning in cooperative, adversarial, and mixed settings with continuous action spaces. To this goal, we adapt the MADDPG algorithm by applying a networked communication approach between agents. We introduce surrogate policies in order to decentralize the training while allowing for local communication during training. The decentralized algorithms achieve comparable results to the original MADDPG in empirical tests, while reducing computational cost. This is more pronounced with larger numbers of agents.