Agent Societies
Dilution, Diffusion and Symbiosis in Spatial Prisoner's Dilemma with Reinforcement Learning
Mangold, Gustavo C., Fernandes, Heitor C. M., Vainstein, Mendeli H.
Recent studies in the spatial prisoner's dilemma games with reinforcement learning have shown that static agents can learn to cooperate through a diverse sort of mechanisms, including noise injection, different types of learning algorithms and neighbours' payoff knowledge. In this work, using an independent multi-agent Q-learning algorithm, we study the effects of dilution and mobility in the spatial version of the prisoner's dilemma. Within this setting, different possible actions for the algorithm are defined, connecting with previous results on the classical, non-reinforcement learning spatial prisoner's dilemma, showcasing the versatility of the algorithm in modeling different game-theoretical scenarios and the benchmarking potential of this approach. As a result, a range of effects is observed, including evidence that games with fixed update rules can be qualitatively equivalent to those with learned ones, as well as the emergence of a symbiotic mutualistic effect between populations that forms when multiple actions are defined.
Modeling Latent Partner Strategies for Adaptive Zero-Shot Human-Agent Collaboration
Li, Benjamin, Shi, Shuyang, Romero, Lucia, Li, Huao, Xie, Yaqi, Kim, Woojun, Nikolaidis, Stefanos, Lewis, Michael, Sycara, Katia, Stepputtis, Simon
In collaborative tasks, being able to adapt to your teammates is a necessary requirement for success. When teammates are heterogeneous, such as in human-agent teams, agents need to be able to observe, recognize, and adapt to their human partners in real time. This becomes particularly challenging in tasks with time pressure and complex strategic spaces where the dynamics can change rapidly. In this work, we introduce TALENTS, a strategy-conditioned cooperator framework that learns to represent, categorize, and adapt to a range of partner strategies, enabling ad-hoc teamwork. Our approach utilizes a variational autoencoder to learn a latent strategy space from trajectory data. This latent space represents the underlying strategies that agents employ. Subsequently, the system identifies different types of strategy by clustering the data. Finally, a cooperator agent is trained to generate partners for each type of strategy, conditioned on these clusters. In order to adapt to previously unseen partners, we leverage a fixed-share regret minimization algorithm that infers and adjusts the estimated partner strategy dynamically. We assess our approach in a customized version of the Overcooked environment, posing a challenging cooperative cooking task that demands strong coordination across a wide range of possible strategies. Using an online user study, we show that our agent outperforms current baselines when working with unfamiliar human partners.
Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing
Hu, Jinwei, Dong, Yi, Ding, Zhengtao, Huang, Xiaowei
This paper presents a defense framework for enhancing the safety of large language model (LLM) empowered multi-agent systems (MAS) in safety-critical domains such as aerospace. We apply randomized smoothing, a statistical robustness certification technique, to the MAS consensus context, enabling probabilistic guarantees on agent decisions under adversarial influence. Unlike traditional verification methods, our approach operates in black-box settings and employs a two-stage adaptive sampling mechanism to balance robustness and computational efficiency. Simulation results demonstrate that our method effectively prevents the propagation of adversarial behaviors and hallucinations while maintaining consensus performance. This work provides a practical and scalable path toward safe deployment of LLM-based MAS in real-world, high-stakes environments.
Rethinking the Illusion of Thinking
Varela, Iñaki Dellibarda, Romero-Sorozabal, Pablo, Rocon, Eduardo, Cebrian, Manuel
Earlier this year, Apple ignited controversy by publishing "The Illusion of Thinking," prompting heated debate within the AI community. Critics seized upon the findings as conclusive evidence that Large Reasoning Models (LRMs) lack genuine reasoning capabilities, branding them as mere stochastic parrots. Meanwhile, defenders-spearheaded by Lawsen et al. (2025)-fired back, condemning the experimental setup as flawed and the conclusions overstated. We clarify this debate by replicating and refining two of the original study's most contentious benchmarks: Towers of Hanoi and River Crossing. By introducing incremental stepwise prompting and agentic collaborative dialogue, we show that previously reported failures solving the Towers of Hanoi were not purely result of output constraints, but also partly a result of cognition limitations: LRMs still stumble when complexity rises moderately (around 8 disks). Moreover, the River Crossing results initially heralded as catastrophic failures turn out to hinge upon testing unsolvable configurations. Once we limit tests strictly to solvable problems-LRMs effortlessly solve large instances involving over 100 agent pairs. Our findings ultimately defy simplistic narratives: today's LRMs are stochastic, RL-tuned searchers in a discrete state space we barely understand. Real progress in symbolic, long-horizon reasoning demands mapping that terrain through fine-grained ablations like those introduced here.
CARTS: Collaborative Agents for Recommendation Textual Summarization
Chen, Jiao, Yao, Kehui, Maragheh, Reza Yousefi, Zhao, Kai, Xu, Jianpeng, Cho, Jason, Korpeoglu, Evren, Kumar, Sushant, Achan, Kannan
Current recommendation systems often require some form of textual data summarization, such as generating concise and coherent titles for product carousels or other grouped item displays. While large language models have shown promise in NLP domains for textual summarization, these approaches do not directly apply to recommendation systems, where explanations must be highly relevant to the core features of item sets, adhere to strict word limit constraints. In this paper, we propose CARTS (Collaborative Agents for Recommendation Textual Summarization), a multi-agent LLM framework designed for structured summarization in recommendation systems. CARTS decomposes the task into three stages-Generation Augmented Generation (GAG), refinement circle, and arbitration, where successive agent roles are responsible for extracting salient item features, iteratively refining candidate titles based on relevance and length feedback, and selecting the final title through a collaborative arbitration process. Experiments on large-scale e-commerce data and live A/B testing show that CARTS significantly outperforms single-pass and chain-of-thought LLM baselines, delivering higher title relevance and improved user engagement metrics.
Enhancing LLM Agent Safety via Causal Influence Prompting
Hahm, Dongyoon, Jin, Woogyeol, Choi, June Suk, Ahn, Sungsoo, Lee, Kimin
As autonomous agents powered by large language models (LLMs) continue to demonstrate potential across various assistive tasks, ensuring their safe and reliable behavior is crucial for preventing unintended consequences. In this work, we introduce CIP, a novel technique that leverages causal influence diagrams (CIDs) to identify and mitigate risks arising from agent decision-making. CIDs provide a structured representation of cause-and-effect relationships, enabling agents to anticipate harmful outcomes and make safer decisions. Our approach consists of three key steps: (1) initializing a CID based on task specifications to outline the decision-making process, (2) guiding agent interactions with the environment using the CID, and (3) iteratively refining the CID based on observed behaviors and outcomes. Experimental results demonstrate that our method effectively enhances safety in both code execution and mobile device control tasks.
AGI Enabled Solutions For IoX Layers Bottlenecks In Cyber-Physical-Social-Thinking Space
Khelloufi, Amar, Ning, Huansheng, Dhelim, Sahraoui, Ding, Jianguo
The integration of the Internet of Everything (IoX) and Artificial General Intelligence (AGI) has given rise to a transformative paradigm aimed at addressing critical bottlenecks across sensing, network, and application layers in Cyber-Physical-Social Thinking (CPST) ecosystems. In this survey, we provide a systematic and comprehensive review of AGI-enhanced IoX research, focusing on three key components: sensing-layer data management, network-layer protocol optimization, and application-layer decision-making frameworks. Specifically, this survey explores how AGI can mitigate IoX bottlenecks challenges by leveraging adaptive sensor fusion, edge preprocessing, and selective attention mechanisms at the sensing layer, while resolving network-layer issues such as protocol heterogeneity and dynamic spectrum management, neuro-symbolic reasoning, active inference, and causal reasoning, Furthermore, the survey examines AGI-enabled frameworks for managing identity and relationship explosion. Key findings suggest that AGI-driven strategies, such as adaptive sensor fusion, edge preprocessing, and semantic modeling, offer novel solutions to sensing-layer data overload, network-layer protocol heterogeneity, and application-layer identity explosion. The survey underscores the importance of cross-layer integration, quantum-enabled communication, and ethical governance frameworks for future AGI-enabled IoX systems. Finally, the survey identifies unresolved challenges, such as computational requirements, scalability, and real-world validation, calling for further research to fully realize AGI's potential in addressing IoX bottlenecks. we believe AGI-enhanced IoX is emerging as a critical research field at the intersection of interconnected systems and advanced AI.
xChemAgents: Agentic AI for Explainable Quantum Chemistry
Polat, Can, Tuncel, Mehmet, Kurban, Mustafa, Serpedin, Erchin, Kurban, Hasan
Recent progress in multimodal graph neural networks has demonstrated that augmenting atomic XYZ geometries with textual chemical descriptors can enhance predictive accuracy across a range of electronic and thermodynamic properties. However, naively appending large sets of heterogeneous descriptors often degrades performance on tasks sensitive to molecular shape or symmetry, and undermines interpretability. xChemAgents proposes a cooperative agent framework that injects physics-aware reasoning into multimodal property prediction. xChemAgents comprises two language-model-based agents: a Selector, which adaptively identifies a sparse, weighted subset of descriptors relevant to each target, and provides a natural language rationale; and a Validator, which enforces physical constraints such as unit consistency and scaling laws through iterative dialogue. On standard benchmark datasets, xChemAgents achieves up to a 22% reduction in mean absolute error over the state-of-the-art baselines, while producing faithful, human-interpretable explanations. Experiment results highlight the potential of cooperative, self-verifying agents to enhance both accuracy and transparency in foundation-model-driven materials science. The implementation and accompanying dataset are available at https://github.com/KurbanIntelligenceLab/xChemAgents.
A Visualization Framework for Exploring Multi-Agent-Based Simulations Case Study of an Electric Vehicle Home Charging Ecosystem
Christensen, Kristoffer, Jørgensen, Bo Nørregaard, Ma, Zheng Grace
Multi-agent-based simulations (MABS) of electric vehicle (EV) home charging ecosystems generate large, complex, and stochastic time-series datasets that capture interactions between households, grid infrastructure, and energy markets. These interactions can lead to unexpected system-level events, such as transformer overloads or consumer dissatisfaction, that are difficult to detect and explain through static post-processing. This paper presents a modular, Python-based dashboard framework, built using Dash by Plotly, that enables efficient, multi-level exploration and root-cause analysis of emergent behavior in MABS outputs. The system features three coordinated views (System Overview, System Analysis, and Consumer Analysis), each offering high-resolution visualizations such as time-series plots, spatial heatmaps, and agent-specific drill-down tools. A case study simulating full EV adoption with smart charging in a Danish residential network demonstrates how the dashboard supports rapid identification and contextual explanation of anomalies, including clustered transformer overloads and time-dependent charging failures. The framework facilitates actionable insight generation for researchers and distribution system operators, and its architecture is adaptable to other distributed energy resources and complex energy systems.
Opinion Dynamics with Highly Oscillating Opinions
Vargas-Pérez, Víctor A., Giráldez-Cru, Jesús, Cordón, Oscar
Opinion Dynamics (OD) models are a particular case of Agent-Based Models in which the evolution of opinions within a population is studied. In most OD models, opinions evolve as a consequence of interactions between agents, and the opinion fusion rule defines how those opinions are updated. In consequence, despite being simplistic, OD models provide an explainable and interpretable mechanism for understanding the underlying dynamics of opinion evolution. Unfortunately, existing OD models mainly focus on explaining the evolution of (usually synthetic) opinions towards consensus, fragmentation, or polarization, but they usually fail to analyze scenarios of (real-world) highly oscillating opinions. This work overcomes this limitation by studying the ability of several OD models to reproduce highly oscillating dynamics. To this end, we formulate an optimization problem which is further solved using Evolutionary Algorithms, providing both quantitative results on the performance of the optimization and qualitative interpretations on the obtained results. Our experiments on a real-world opinion dataset about immigration from the monthly barometer of the Spanish Sociological Research Center show that the ATBCR, based on both rational and emotional mechanisms of opinion update, is the most accurate OD model for capturing highly oscillating opinions.