Goto

Collaborating Authors

 Agent Societies


Agentic Discovery: Closing the Loop with Cooperative Agents

arXiv.org Artificial Intelligence

Abstract--As data-driven methods, artificial intelligence (AI), and automated workflows accelerate scientific tasks, we see the rate of discovery increasingly limited by human decision-making tasks such as setting objectives, generating hypotheses, and designing experiments. We postulate that cooperative agents are needed to augment the role of humans and enable autonomous discovery . Realizing such agents will require progress in both AI and infrastructure. This situation is emblematic of broader transformations associated with the fourth and fifth paradigms of science, which capture the shift towards data-intensive methods and artificial intelligence, respectively, as integral aspects of scientific exploration [1], [2]. Fields ranging from astrophysics to social sciences now rely on vast datasets, AI models, and computational methods to drive innovation. Hence we face the challenge of not just managing data and building models, but also building systems that enable researchers to integrate and utilize data and models at scale. Current approaches to integrating data-intensive workflows and AI methods have yielded successes, but use techniques that result in siloed solutions that fail to scale or generalize. This paradigm shift demands more than building increasingly sophisticated tools; it calls for a fundamental rethinking of how science is conducted.


PSN Game: Game-theoretic Prediction and Planning via a Player Selection Network

arXiv.org Artificial Intelligence

While game-theoretic planning frameworks are effective at modeling multi-agent interactions, they require solving large optimization problems where the number of variables increases with the number of agents, resulting in long computation times that limit their use in large-scale, real-time systems. To address this issue, we propose 1) PSN Game: a learning-based, game-theoretic prediction and planning framework that reduces runtime by learning a Player Selection Network (PSN); and 2) a Goal Inference Network (GIN) that makes it possible to use the PSN in incomplete information games where agents' intentions are unknown. A PSN outputs a player selection mask that distinguishes influential players from less relevant ones, enabling the ego player to solve a smaller, masked game involving only selected players. By reducing the number of players in the game, and therefore reducing the number of variables in the corresponding optimization problem, PSN directly lowers computation time. The PSN Game framework is more flexible than existing player selection methods as it 1) relies solely on observations of players' past trajectories, without requiring full state, action, or other game-specific information; and 2) requires no online parameter tuning. Experiments in both simulated scenarios and human trajectory datasets demonstrate that PSNs outperform baseline selection methods in 1) prediction accuracy; and 2) planning safety. PSNs also generalize effectively to real-world scenarios in which agents' objectives are unknown without fine-tuning. By selecting only the most relevant players for decision-making, PSN Game offers a general mechanism for reducing planning complexity that can be seamlessly integrated into existing multi-agent planning frameworks.


MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics

arXiv.org Artificial Intelligence

The polarization of opinions, information segregation, and cognitive biases on social media have attracted significant academic attention. In real-world networks, information often spans multiple interrelated topics, posing challenges for opinion evolution and highlighting the need for frameworks that simulate interactions among topics. Existing studies based on large language models (LLMs) focus largely on single topics, limiting the capture of cognitive transfer in multi-topic, cross-domain contexts. Traditional numerical models, meanwhile, simplify complex linguistic attitudes into discrete values, lacking interpretability, behavioral consistency, and the ability to integrate multiple topics. To address these issues, we propose Multi-topic Opinion Simulation (MTOS), a social simulation framework integrating multi-topic contexts with LLMs. MTOS leverages LLMs alongside short-term and long-term memory, incorporates multiple user-selection interaction mechanisms and dynamic topic-selection strategies, and employs a belief decay mechanism to enable perspective updates across topics. We conduct extensive experiments on MTOS, varying topic numbers, correlation types, and performing ablation studies to assess features such as group polarization and local consistency. Results show that multi-topic settings significantly alter polarization trends: positively correlated topics amplify echo chambers, negatively correlated topics inhibit them, and irrelevant topics also mitigate echo chamber effects through resource competition. Compared with numerical models, LLM-based agents realistically simulate dynamic opinion changes, reproduce linguistic features of news texts, and capture complex human reasoning, improving simulation interpretability and system stability.


Heterogeneous RBCs via deep multi-agent reinforcement learning

arXiv.org Artificial Intelligence

Current macroeconomic models with agent heterogeneity can be broadly divided into two main groups. Heterogeneous-agent general equilibrium (GE) models, such as those based on Heterogeneous Agents New Keynesian (HANK) or Krusell-Smith (KS) approaches, rely on GE and 'rational expectations', somewhat unrealistic assumptions that make the models very computationally cumbersome, which in turn limits the amount of heterogeneity that can be modelled. In contrast, agent-based models (ABMs) can flexibly encompass a large number of arbitrarily heterogeneous agents, but typically require the specification of explicit behavioural rules, which can lead to a lengthy trial-and-error model-development process. To address these limitations, we introduce MARL-BC, a framework that integrates deep multi-agent reinforcement learning (MARL) with Real Business Cycle (RBC) models. We demonstrate that MARL-BC can: (1) recover textbook RBC results when using a single agent; (2) recover the results of the mean-field KS model using a large number of identical agents; and (3) effectively simulate rich heterogeneity among agents, a hard task for traditional GE approaches. Our framework can be thought of as an ABM if used with a variety of heterogeneous interacting agents, and can reproduce GE results in limit cases. As such, it is a step towards a synthesis of these often opposed modelling paradigms.


Mean-Field Games with Constraints

arXiv.org Artificial Intelligence

This paper introduces a framework of Constrained Mean-Field Games (CMFGs), where each agent solves a constrained Markov decision process (CMDP). This formulation captures scenarios in which agents' strategies are subject to feasibility, safety, or regulatory restrictions, thereby extending the scope of classical mean field game (MFG) models. We first establish the existence of CMFG equilibria under a strict feasibility assumption, and we further show uniqueness under a classical monotonicity condition. To compute equilibria, we develop Constrained Mean-Field Occupation Measure Optimization (CMFOMO), an optimization-based scheme that parameterizes occupation measures and shows that finding CMFG equilibria is equivalent to solving a single optimization problem with convex constraints and bounded variables. CMFOMO does not rely on uniqueness of the equilibria and can approximate all equilibria with arbitrary accuracy. We further prove that CMFG equilibria induce $O(1 / \sqrt{N})$-Nash equilibria in the associated constrained $N$-player games, thereby extending the classical justification of MFGs as approximations for large but finite systems. Numerical experiments on a modified Susceptible-Infected-Susceptible (SIS) epidemic model with various constraints illustrate the effectiveness and flexibility of the framework.


Autonomous vehicles need social awareness to find optima in multi-agent reinforcement learning routing games

arXiv.org Artificial Intelligence

Previous work has shown that when multiple selfish Autonomous Vehicles (AVs) are introduced to future cities and start learning optimal routing strategies using Multi-Agent Reinforcement Learning (MARL), they may destabilize traffic systems, as they would require a significant amount of time to converge to the optimal solution, equivalent to years of real-world commuting. We demonstrate that moving beyond the selfish component in the reward significantly relieves this issue. If each AV, apart from minimizing its own travel time, aims to reduce its impact on the system, this will be beneficial not only for the system-wide performance but also for each individual player in this routing game. By introducing an intrinsic reward signal based on the marginal cost matrix, we significantly reduce training time and achieve convergence more reliably. Marginal cost quantifies the impact of each individual action (route-choice) on the system (total travel time). Including it as one of the components of the reward can reduce the degree of non-stationarity by aligning agents' objectives. Notably, the proposed counterfactual formulation preserves the system's equilibria and avoids oscillations. Our experiments show that training MARL algorithms with our novel reward formulation enables the agents to converge to the optimal solution, whereas the baseline algorithms fail to do so. We show these effects in both a toy network and the real-world network of Saint-Arnoult. Our results optimistically indicate that social awareness (i.e., including marginal costs in routing decisions) improves both the system-wide and individual performance of future urban systems with AVs.


Mediator-Guided Multi-Agent Collaboration among Open-Source Models for Medical Decision-Making

arXiv.org Artificial Intelligence

Complex medical decision-making involves cooperative workflows operated by different clinicians. Designing AI multi-agent systems can expedite and augment human-level clinical decision-making. Existing multi-agent researches primarily focus on language-only tasks, yet their extension to multimodal scenarios remains challenging. A blind combination of diverse vision-language models (VLMs) can amplify an erroneous outcome interpretation. VLMs in general are less capable in instruction following and importantly self-reflection, compared to large language models (LLMs) of comparable sizes. This disparity largely constrains VLMs' ability in cooperative workflows. In this study, we propose MedOrch, a mediator-guided multi-agent collaboration framework for medical multimodal decision-making. MedOrch employs an LLM-based mediator agent that enables multiple VLM-based expert agents to exchange and reflect on their outputs towards collaboration. We utilize multiple open-source general-purpose and domain-specific VLMs instead of costly GPT-series models, revealing the strength of heterogeneous models. We show that the collaboration within distinct VLM-based agents can surpass the capabilities of any individual agent. We validate our approach on five medical vision question answering benchmarks, demonstrating superior collaboration performance without model training. Our findings underscore the value of mediator-guided multi-agent collaboration in advancing medical multimodal intelligence.


Fake News in Social Networks

arXiv.org Artificial Intelligence

We propose multi-agent reinforcement learning as a new method for modeling fake news in social networks. This method allows us to model human behavior in social networks both in unaccustomed populations and in populations that have adapted to the presence of fake news. In particular the latter is challenging for existing methods. We find that a fake-news attack is more effective if it targets highly connected people and people with weaker private information. Attacks are more effective when the disinformation is spread across several agents than when the disinformation is concentrated with more intensity on fewer agents. Furthermore, fake news spread less well in balanced networks than in clustered networks. We test a part of our findings in a human-subject experiment. The experimental evidence provides support for the predictions from the model, suggesting that the model is suitable to analyze the spread of fake news in social networks.


StoryBox: Collaborative Multi-Agent Simulation for Hybrid Bottom-Up Long-Form Story Generation Using Large Language Models

arXiv.org Artificial Intelligence

Human writers often begin their stories with an overarching mental scene, where they envision the interactions between characters and their environment. Inspired by this creative process, we propose a novel approach to long-form story generation, termed hybrid bottom-up long-form story generation, using multi-agent simulations. In our method, agents interact within a dynamic sandbox environment, where their behaviors and interactions with one another and the environment generate emergent events. These events form the foundation for the story, enabling organic character development and plot progression. Unlike traditional top-down approaches that impose rigid structures, our hybrid bottom-up approach allows for the natural unfolding of events, fostering more spontaneous and engaging storytelling. The system is capable of generating stories exceeding 10,000 words while maintaining coherence and consistency, addressing some of the key challenges faced by current story generation models. We achieve state-of-the-art performance across several metrics. This approach offers a scalable and innovative solution for creating dynamic, immersive long-form stories that evolve organically from agent-driven interactions.


Talk Isn't Always Cheap: Understanding Failure Modes in Multi-Agent Debate

arXiv.org Artificial Intelligence

While multi-agent debate has been proposed as a promising strategy for improving AI reasoning ability, we find that debate can sometimes be harmful rather than helpful. Prior work has primarily focused on debates within homogeneous groups of agents, whereas we explore how diversity in model capabilities influences the dynamics and outcomes of multi-agent interactions. Through a series of experiments, we demonstrate that debate can lead to a decrease in accuracy over time - even in settings where stronger (i.e., more capable) models outnumber their weaker counterparts. Our analysis reveals that models frequently shift from correct to incorrect answers in response to peer reasoning, favoring agreement over challenging flawed reasoning. We perform additional experiments investigating various potential contributing factors to these harmful shifts - including sycophancy, social conformity, and model and task type. These results highlight important failure modes in the exchange of reasons during multi-agent debate, suggesting that naive applications of debate may cause performance degradation when agents are neither incentivised nor adequately equipped to resist persuasive but incorrect reasoning.