Agents
Simulating Persuasive Dialogues on Meat Reduction with Generative Agents
Ahnert, Georg, Wurth, Elena, Strohmaier, Markus, Mata, Jutta
Meat reduction benefits human and planetary health, but social norms keep meat central in shared meals. To date, the development of communication strategies that promote meat reduction while minimizing social costs has required the costly involvement of human participants at each stage of the process. We present work in progress on simulating multi-round dialogues on meat reduction between Generative Agents based on large language models (LLMs). We measure our main outcome using established psychological questionnaires based on the Theory of Planned Behavior and additionally investigate Social Costs. We find evidence that our preliminary simulations produce outcomes that are (i) consistent with theoretical expectations; and (ii) valid when compared to data from previous studies with human participants. Generative agent-based models are a promising tool for identifying novel communication strategies on meat reduction -- tailored to highly specific participant groups -- to then be tested in subsequent studies with human participants.
Modeling AI-Driven Production and Competitiveness A Multi-Agent Economic Simulation of China and the United States
MODELING AI-DRIVEN PRODUCTION AND COMPETITIVENESS: A MUL TI-AGENT ECONOMIC SIMULA TION OF CHINA AND THE UNITED ST A TES Y uxinyue Qian, Jun Liu Beijing University of Posts and Telecommunications liujun@bupt.edu.cn ABSTRACT With the rapid development of artificial intelligence (AI) technology, socio-economic systems are entering a new stage of "human-AI co-creation." Building upon a previously established multi-level intelligent agent economic model, this paper conducts simulation-based comparisons of macroeconomic output evolution in China and the United States under different mechanisms--AI collaboration, network effects, and AI autonomous production. The results show that: (1) when AI functions as an independent productive entity, the overall growth rate of social output far exceeds that of traditional human-labor-based models; (2) China demonstrates clear potential for acceleration in both the expansion of intelligent agent populations and the pace of technological catch-up, offering the possibility of achieving technological convergence or even partial surpassing. This study provides a systematic, model-based analytical framework for understanding AI-driven production system transformation and shifts in international competitiveness, as well as quantitative insights for relevant policy formulation. Comparison 1. INTRODUCTION Since the beginning of the 21st century, the rapid evolution of generative artificial intelligence (AI) and autonomous intelligent agents (AI agents) has profoundly reshaped the operating mechanisms of socioeconomic systems. Overall, the United States maintains a significant lead in core model development and capital investment.
Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling
Tan, Tianyi, Zheng, Yinan, Liang, Ruiming, Wang, Zexu, Zheng, Kexin, Zheng, Jinliang, Li, Jianxiong, Zhan, Xianyuan, Liu, Jingjing
Modeling interactive driving behaviors in complex scenarios remains a fundamental challenge for autonomous driving planning. Learning-based approaches attempt to address this challenge with advanced generative models, removing the dependency on over-engineered architectures for representation fusion. However, brute-force implementation by simply stacking transformer blocks lacks a dedicated mechanism for modeling interactive behaviors that are common in real driving scenarios. The scarcity of interactive driving data further exacerbates this problem, leaving conventional imitation learning methods ill-equipped to capture high-value interactive behaviors. We propose Flow Planner, which tackles these problems through coordinated innovations in data modeling, model architecture, and learning scheme. Specifically, we first introduce fine-grained trajectory tokenization, which decomposes the trajectory into overlapping segments to decrease the complexity of whole trajectory modeling. With a sophisticatedly designed architecture, we achieve efficient temporal and spatial fusion of planning and scene information, to better capture interactive behaviors. In addition, the framework incorporates flow matching with classifier-free guidance for multi-modal behavior generation, which dynamically reweights agent interactions during inference to maintain coherent response strategies, providing a critical boost for interactive scenario understanding. Experimental results on the large-scale nuPlan dataset and challenging interactive interPlan dataset demonstrate that Flow Planner achieves state-of-the-art performance among learning-based approaches while effectively modeling interactive behaviors in complex driving scenarios.
Automating Structural Engineering Workflows with Large Language Model Agents
Liang, Haoran, Zhou, Yufa, Kalaleh, Mohammad Talebi, Mei, Qipei
We introduce $\textbf{MASSE}$, the first Multi-Agent System for Structural Engineering, effectively integrating large language model (LLM)-based agents with real-world engineering workflows. Structural engineering is a fundamental yet traditionally stagnant domain, with core workflows remaining largely unchanged for decades despite its substantial economic impact and global market size. Recent advancements in LLMs have significantly enhanced their ability to perform complex reasoning, long-horizon planning, and precise tool utilization -- capabilities well aligned with structural engineering tasks such as interpreting design codes, executing load calculations, and verifying structural capacities. We present a proof-of-concept showing that most real-world structural engineering workflows can be fully automated through a training-free LLM-based multi-agent system. MASSE enables immediate deployment in professional environments, and our comprehensive validation on real-world case studies demonstrates that it can reduce expert workload from approximately two hours to mere minutes, while enhancing both reliability and accuracy in practical engineering scenarios.
A Survey on Agentic Multimodal Large Language Models
Yao, Huanjin, Zhang, Ruifei, Huang, Jiaxing, Zhang, Jingyi, Wang, Yibo, Fang, Bo, Zhu, Ruolin, Jing, Yongcheng, Liu, Shunyu, Li, Guanbin, Tao, Dacheng
With the recent emergence of revolutionary autonomous agentic systems, research community is witnessing a significant shift from traditional static, passive, and domain-specific AI agents toward more dynamic, proactive, and generalizable agentic AI. Motivated by the growing interest in agentic AI and its potential trajectory toward AGI, we present a comprehensive survey on Agentic Multimodal Large Language Models (Agentic MLLMs). In this survey, we explore the emerging paradigm of agentic MLLMs, delineating their conceptual foundations and distinguishing characteristics from conventional MLLM-based agents. We establish a conceptual framework that organizes agentic MLLMs along three fundamental dimensions: (i) Agentic internal intelligence functions as the system's commander, enabling accurate long-horizon planning through reasoning, reflection, and memory; (ii) Agentic external tool invocation, whereby models proactively use various external tools to extend their problem-solving capabilities beyond their intrinsic knowledge; and (iii) Agentic environment interaction further situates models within virtual or physical environments, allowing them to take actions, adapt strategies, and sustain goal-directed behavior in dynamic real-world scenarios. To further accelerate research in this area for the community, we compile open-source training frameworks, training and evaluation datasets for developing agentic MLLMs. Finally, we review the downstream applications of agentic MLLMs and outline future research directions for this rapidly evolving field. To continuously track developments in this rapidly evolving field, we will also actively update a public repository at https://github.com/HJYao00/Awesome-Agentic-MLLMs.
Game-Theoretic Risk-Shaped Reinforcement Learning for Safe Autonomous Driving
Hu, Dong, Hu, Fenqing, Yang, Lidong, Huang, Chao
Ensuring safety in autonomous driving (AD) remains a significant challenge, especially in highly dynamic and complex traffic environments where diverse agents interact and unexpected hazards frequently emerge. Traditional reinforcement learning (RL) methods often struggle to balance safety, efficiency, and adaptability, as they primarily focus on reward maximization without explicitly modeling risk or safety constraints. To address these limitations, this study proposes a novel game-theoretic risk-shaped RL (GTR2L) framework for safe AD. GTR2L incorporates a multi-level game-theoretic world model that jointly predicts the interactive behaviors of surrounding vehicles and their associated risks, along with an adaptive rollout horizon that adjusts dynamically based on predictive uncertainty. Furthermore, an uncertainty-aware barrier mechanism enables flexible modulation of safety boundaries. A dedicated risk modeling approach is also proposed, explicitly capturing both epistemic and aleatoric uncertainty to guide constrained policy optimization and enhance decision-making in complex environments. Extensive evaluations across diverse and safety-critical traffic scenarios show that GTR2L significantly outperforms state-of-the-art baselines, including human drivers, in terms of success rate, collision and violation reduction, and driving efficiency. The code is available at https://github.com/DanielHu197/GTR2L.
The Social Cost of Intelligence: Emergence, Propagation, and Amplification of Stereotypical Bias in Multi-Agent Systems
Nguyen, Thi-Nhung, Luo, Linhao, Vu, Thuy-Trang, Phung, Dinh
Bias in large language models (LLMs) remains a persistent challenge, manifesting in stereotyping and unfair treatment across social groups. While prior research has primarily focused on individual models, the rise of multi-agent systems (MAS), where multiple LLMs collaborate and communicate, introduces new and largely unexplored dynamics in bias emergence and propagation. In this work, we present a comprehensive study of stereotypical bias in MAS, examining how internal specialization, underlying LLMs and inter-agent communication protocols influence bias robustness, propagation, and amplification. We simulate social contexts where agents represent different social groups and evaluate system behavior under various interaction and adversarial scenarios. Experiments on three bias benchmarks reveal that MAS are generally less robust than single-agent systems, with bias often emerging early through in-group favoritism. However, cooperative and debate-based communication can mitigate bias amplification, while more robust underlying LLMs improve overall system stability. Our findings highlight critical factors shaping fairness and resilience in multi-agent LLM systems.
Neutral Agent-based Adversarial Policy Learning against Deep Reinforcement Learning in Multi-party Open Systems
Peng, Qizhou, Zheng, Yang, Wen, Yu, Wu, Yanna, Du, Yingying
Reinforcement learning (RL) has been an important machine learning paradigm for solving long-horizon sequential decision-making problems under uncertainty. By integrating deep neural networks (DNNs) into the RL framework, deep reinforcement learning (DRL) has emerged, which achieved significant success in various domains. However, the integration of DNNs also makes it vulnerable to adversarial attacks. Existing adversarial attack techniques mainly focus on either directly manipulating the environment with which a victim agent interacts or deploying an adversarial agent that interacts with the victim agent to induce abnormal behaviors. While these techniques achieve promising results, their adoption in multi-party open systems remains limited due to two major reasons: impractical assumption of full control over the environment and dependent on interactions with victim agents. To enable adversarial attacks in multi-party open systems, in this paper, we redesigned an adversarial policy learning approach that can mislead well-trained victim agents without requiring direct interactions with these agents or full control over their environments. Particularly, we propose a neutral agent-based approach across various task scenarios in multi-party open systems. While the neutral agents seemingly are detached from the victim agents, indirectly influence them through the shared environment. We evaluate our proposed method on the SMAC platform based on Starcraft II and the autonomous driving simulation platform Highway-env. The experimental results demonstrate that our method can launch general and effective adversarial attacks in multi-party open systems.
Fast and the Furious: Hot Starts in Pursuit-Evasion Games
Smithline, Gabriel, Nivison, Scott
Effectively positioning pursuers in pursuit-evasion games without prior knowledge of evader locations remains a significant challenge. A novel approach that combines game-theoretic control theory with Graph Neural Networks is introduced in this work. By conceptualizing pursuer configurations as strategic arrangements and representing them as graphs, a Graph Characteristic Space is constructed via multi-objective optimization to identify Pareto-optimal configurations. A Graph Convolutional Network (GCN) is trained on these Pareto-optimal graphs to generate strategically effective initial configurations, termed "hot starts". Empirical evaluations demonstrate that the GCN-generated hot starts provide a significant advantage over random configurations. In scenarios considering multiple pursuers and evaders, this method hastens the decline in evader survival rates, reduces pursuer travel distances, and enhances containment, showcasing clear strategic benefits.
Agentic RAG for Software Testing with Hybrid Vector-Graph and Multi-Agent Orchestration
Hariharan, Mohanakrishnan, Arvapalli, Satish, Barma, Seshu, Sheela, Evangeline
-- W e present a n approach to software testing automation using Agentic Retrieval - Augmented Generation (RAG) systems for Quality Engineering (QE) artifact creation. We combine autonomous AI agents with hybrid vector - graph knowledge systems to automate test plan, case, and Q E metric generation. The system achieves remarkable accuracy improvements from 65% to 94.8% while ensuring comprehensive document traceability throughout the quality engineering lifecycle. Experimental validat ion of enterprise Corporate Systems Engineering and SAP migration projects demonstrates an 85% reduction in testing timeline, a n 85% improvement in test suite efficiency, and projected 35% cost savings, resulting in a 2 - month acceleration of go - live . Index Terms -- agentic systems, retrieval - augmented generation, software testing, quality engineering, multi - agent orchestration, hybrid vector - graph, test automation, SAP testing, en terprise systems These limitations become particularly pronounced in enterprise software testing, where maintaining traceability between requirements, test cases, and business logic is paramount for regulatory compliance and quality assurance.