 Agents


Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems

arXiv.org Artificial Intelligence

Multi-agent collaboration has emerged as a pivotal paradigm for addressing complex, distributed tasks in large language model (LLM)-driven applications. While prior research has focused on high-level architectural frameworks, the granular mechanisms governing agents, critical to performance and scalability, remain underexplored. This study systematically investigates four dimensions of collaboration strategies: (1) agent governance, (2) participation control, (3) interaction dynamics, and (4) dialogue history management. Through rigorous experimentation under two context-dependent scenarios: Distributed Evidence Integration (DEI) and Structured Evidence Synthesis (SES), we quantify the impact of these strategies on both task accuracy and computational efficiency. Our findings reveal that centralized governance, instructor-led participation, ordered interaction patterns, and instructor-curated context summarization collectively optimize the trade-off between decision quality and resource utilization with the support of the proposed Token-Accuracy Ratio (TAR). This work establishes a foundation for designing adaptive, scalable multi-agent systems, shifting the focus from structural novelty to strategic interaction mechanics.


BeliefNest: A Joint Action Simulator for Embodied Agents with Theory of Mind

arXiv.org Artificial Intelligence

Theory of Mind is a fundamental cognitive ability that underpins human social behavior, enabling individuals to infer the beliefs, intentions, and knowledge of others. In this paper, we propose BeliefNest, an open-source simulator designed to support research on collaborative behavior in embodied agents endowed with Theory of Mind capabilities. Recent advances in embodied agents powered by large language models (LLMs) have shown promising progress. However, there is still no platform that can explicitly represent nested belief states and integrate them with action generation mechanisms. BeliefNest addresses this gap by providing a flexible simulation framework that incorporates both hierarchical belief structures and prompt generation support. BeliefNest offers the following features: (1) explicit representation of nested belief states, as studied in Theory of Mind, using hierarchical simulators (see Section 3); (2) support for prompt generation based on each belief state, enabling the design and evaluation of methods for agent control with LLMs (see Section 5); and (3) integration with the Minecraft environment, which is widely used in LLM agent research [1-4], and support for open-domain tasks. In this paper, we describe the design and functionality of BeliefNest and demonstrate its effectiveness through experiments on false-belief tasks.


LAMeTA: Intent-Aware Agentic Network Optimization via a Large AI Model-Empowered Two-Stage Approach

arXiv.org Artificial Intelligence

Nowadays, Generative AI (GenAI) reshapes numerous domains by enabling machines to create content across modalities. As GenAI evolves into autonomous agents capable of reasoning, collaboration, and interaction, they are increasingly deployed on network infrastructures to serve humans automatically. This emerging paradigm, known as the agentic network, presents new optimization challenges due to the demand to incorporate subjective intents of human users expressed in natural language. Traditional generic Deep Reinforcement Learning (DRL) struggles to capture intent semantics and adjust policies dynamically, thus leading to suboptimality. First, we propose Intent-oriented Knowledge Distillation (IoKD), which efficiently distills intent-understanding capabilities from resource-intensive LAMs to lightweight edge LAMs (E-LAMs) to serve end users. Second, we develop Symbiotic Reinforcement Learning (SRL), integrating E-LAMs with a policy-based DRL framework. In SRL, E-LAMs translate natural language user intents into structured preference vectors that guide both state representation and reward design. The DRL, in turn, optimizes the generative service function chain composition and E-LAM selection based on real-time network conditions, thus optimizing the subjective Quality-of-Experience (QoE). Extensive experiments conducted in an agentic network with 81 agents demonstrate that IoKD reduces mean squared error in intent prediction by up to 22.5%, while SRL outperforms conventional generic DRL by up to 23.5% in maximizing intent-aware QoE. Generative AI (GenAI) has revolutionized the technological landscape, enabling machines to create content across multiple modalities, including text, images, and videos [1]. Moreover, GenAI is rapidly evolving from basic content generation to complex reasoning and decision-making, transforming how machines interact with and serve humans.
G. Sun is with the College of Computer Science and Technology, Jilin University, China, and also with the College of Computing and Data Science, Nanyang Technological University, Singapore (e-mail: sungeng@jlu.edu.cn).


CrafText Benchmark: Advancing Instruction Following in Complex Multimodal Open-Ended World

arXiv.org Artificial Intelligence

Following instructions in real-world conditions requires the ability to adapt to the world's volatility and entanglement: the environment is dynamic and unpredictable, instructions can be linguistically complex with diverse vocabulary, and the number of possible goals an agent may encounter is vast. Despite extensive research in this area, most studies are conducted in static environments with simple instructions and a limited vocabulary, making it difficult to assess agent performance in more diverse and challenging settings. To address this gap, we introduce CrafText, a benchmark for evaluating instruction following in a multimodal environment with diverse instructions and dynamic interactions. CrafText includes 3,924 instructions with 3,423 unique words, covering Localization, Conditional, Building, and Achievement tasks. Additionally, we propose an evaluation protocol that measures an agent's ability to generalize to novel instruction formulations and dynamically evolving task configurations, providing a rigorous test of both linguistic understanding and adaptive decision-making.


Position Paper: Bounded Alignment: What (Not) To Expect From AGI Agents

arXiv.org Artificial Intelligence

The issues of AI risk and AI safety are becoming critical as the prospect of artificial general intelligence (AGI) looms larger. The emergence of extremely large and capable generative models has led to alarming predictions and created a stir from boardrooms to legislatures. As a result, AI alignment has emerged as one of the most important areas in AI research. The goal of this position paper is to argue that the currently dominant vision of AGI in the AI and machine learning (AI/ML) community needs to evolve, and that expectations and metrics for its safety must be informed much more by our understanding of the only existing instance of general intelligence, i.e., the intelligence found in animals, and especially in humans. This change in perspective will lead to a more realistic view of the technology, and allow for better policy decisions. The most successful AI systems today, such as large language models (LLMs) [1]-[5], are based on a computationalist, statistical, and decision-theoretic paradigm rather than a biological one. As these systems scale up in size, they are improving their performance in areas such as reasoning [6]-[9], and becoming more multimodal [10]-[14]. AI agents [15]-[17], including physical ones [18]-[20], are also becoming increasingly capable. With these rapid advances, there is an expectation that powerful systems with artificial general intelligence (AGI) may soon be at hand. Through all this, there is a general desire that AGI must remain subject to human control and intervention, and must exist only to serve human needs (see, for example, the discussion in [21]). There is also great concern that increasingly powerful AGI systems with autonomous agency might pose serious risks, including existential ones [22]-[27], which has led to a focus on AI alignment, i.e., making AI systems consistent with human norms and preferences [28], [29].
The main position argued in this paper is that: 1) General intelligence should be seen in terms of its archetype: The intelligence of living agents; and 2) The goal of building powerful AGI agents is fundamentally inconsistent with the expectation of complete alignment or near-total control of AGI agents by humans even in principle.


Decentralized Traffic Flow Optimization Through Intrinsic Motivation

arXiv.org Artificial Intelligence

Traffic congestion has long been a ubiquitous problem that is worsening with the rapid growth of megacities. In this proof-of-concept work we study intrinsic motivation, implemented via the empowerment principle, to control autonomous car behavior to improve traffic flow. In standard models of traffic dynamics, self-organized traffic jams emerge spontaneously from the individual behavior of cars, affecting traffic over long distances. Our novel car behavior strategy improves traffic flow while still being decentralized and using only locally available information without explicit coordination. Decentralization is essential for various reasons, not least to be able to robustly absorb substantial levels of uncertainty. Our scenario is based on the well-established traffic dynamics model, the Nagel-Schreckenberg cellular automaton. In a fraction of the cars in this model, we replace the default behavior with empowerment, our intrinsic motivation-based method. This proposed model significantly improves overall traffic flow, mitigates congestion, and reduces the average traffic jam time.
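For readers unfamiliar with the baseline model named in this abstract, the following is a minimal sketch of the standard Nagel-Schreckenberg update rule on a ring road. It shows only the default car behavior, not the paper's empowerment-based controller. The function name `nasch_step` and the parameter names and default values (`v_max`, `p_slow`) are illustrative assumptions, not taken from the paper.

```python
import random

def nasch_step(positions, velocities, road_length, v_max=5, p_slow=0.3, rng=random):
    """One parallel update of the Nagel-Schreckenberg model on a ring road.

    positions: cell index of each car, listed in cyclic driving order
               (car i+1 is the car directly ahead of car i).
    velocities: current speed of each car, in cells per time step.
    """
    n = len(positions)
    new_velocities = []
    for i in range(n):
        # Number of empty cells between car i and the car ahead (periodic road).
        ahead = positions[(i + 1) % n]
        gap = (ahead - positions[i] - 1) % road_length
        v = min(velocities[i] + 1, v_max)      # 1. accelerate toward v_max
        v = min(v, gap)                        # 2. brake to avoid a collision
        if v > 0 and rng.random() < p_slow:    # 3. random slowdown
            v -= 1
        new_velocities.append(v)
    # 4. all cars move simultaneously
    new_positions = [(x + v) % road_length for x, v in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

Because every car brakes to at most the gap measured before anyone moves, no two cars can end up in the same cell; the random slowdown in step 3 is what lets jams form spontaneously in this model.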


Think Twice Before You Act: Enhancing Agent Behavioral Safety with Thought Correction

arXiv.org Artificial Intelligence

LLM-based autonomous agents possess capabilities such as reasoning, tool invocation, and environment interaction, enabling the execution of complex multi-step tasks. The internal reasoning process (the thought) within a behavioral trajectory significantly influences tool usage and subsequent actions but can introduce potential risks. Even minor deviations in the agent's thought may trigger cascading effects leading to irreversible safety incidents. To address the safety alignment challenges in long-horizon behavioral trajectories, we propose Thought-Aligner, a plug-in dynamic thought correction module. Utilizing a lightweight and resource-efficient model, Thought-Aligner corrects each high-risk thought on the fly before each action execution. The corrected thought is then reintroduced to the agent, ensuring safer subsequent decisions and tool interactions. Importantly, Thought-Aligner modifies only the reasoning phase without altering the underlying agent framework, making it easy to deploy and widely applicable to various agent frameworks. To train the Thought-Aligner model, we construct an instruction dataset across ten representative scenarios and simulate ReAct execution trajectories, generating 5,000 diverse instructions and more than 11,400 safe and unsafe thought pairs. The model is fine-tuned using contrastive learning techniques. Experiments across three agent safety benchmarks involving 12 different LLMs demonstrate that Thought-Aligner raises agent behavioral safety from approximately 50% in the unprotected setting to 90% on average. Additionally, Thought-Aligner maintains response latency below 100ms with minimal resource usage, demonstrating its capability for efficient deployment, broad applicability, and timely responsiveness. This method thus provides a practical dynamic safety solution for LLM-based agents.
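The control flow described in this abstract (generate a thought, correct it if risky, then act on the corrected thought) can be sketched roughly as below. This is only an illustration of where a correction hook could sit in a ReAct-style loop; the function `run_with_thought_correction` and the callables `think`, `correct`, and `act` are hypothetical stand-ins, not the Thought-Aligner interface.

```python
from typing import Callable, List, Tuple

def run_with_thought_correction(
    think: Callable[[List[str]], str],   # hypothetical: agent produces a thought from history
    correct: Callable[[str], str],       # hypothetical: returns a safer thought, or the input unchanged
    act: Callable[[str], str],           # hypothetical: executes an action, returns an observation
    max_steps: int,
) -> List[Tuple[str, str]]:
    """Run a thought/act loop in which every thought passes through `correct` first."""
    history: List[str] = []
    trace: List[Tuple[str, str]] = []
    for _ in range(max_steps):
        thought = think(history)
        safe_thought = correct(thought)      # correction happens before any action
        observation = act(safe_thought)      # the agent only ever acts on the corrected thought
        # The corrected thought, not the raw one, is fed back into the agent's context.
        history.extend([safe_thought, observation])
        trace.append((safe_thought, observation))
    return trace
```

Only the reasoning step is intercepted here; the agent's `think` and `act` functions are left untouched, which mirrors the plug-in property the abstract emphasizes.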


Towards Multi-Agent Reasoning Systems for Collaborative Expertise Delegation: An Exploratory Design Study

arXiv.org Artificial Intelligence

Designing effective collaboration structures for multi-agent LLM systems to enhance collective reasoning is crucial yet remains under-explored. In this paper, we systematically investigate how collaborative reasoning performance is affected by three key design dimensions: (1) Expertise-Domain Alignment, (2) Collaboration Paradigm (structured workflow vs. diversity-driven integration), and (3) System Scale. Our findings reveal that expertise alignment benefits are highly domain-contingent, proving most effective for contextual reasoning tasks. Furthermore, collaboration focused on integrating diverse knowledge consistently outperforms rigid task decomposition. Finally, we empirically explore the impact of scaling the multi-agent system with expertise specialization and study the computational trade-off, highlighting the need for more efficient communication protocol design. This work provides concrete guidelines for configuring specialized multi-agent systems and identifies critical architectural trade-offs and bottlenecks for scalable multi-agent reasoning. The code will be made available upon acceptance.


Bidirectional Distillation: A Mixed-Play Framework for Multi-Agent Generalizable Behaviors

arXiv.org Artificial Intelligence

Population-population generalization is a challenging problem in multi-agent reinforcement learning (MARL), particularly when agents encounter unseen co-players. However, existing self-play-based methods are limited to inside-space generalization. In this study, we propose Bidirectional Distillation (BiDist), a novel mixed-play framework, to overcome this limitation in MARL. BiDist leverages knowledge distillation in two alternating directions: forward distillation, which emulates the historical policies' space and creates an implicit self-play, and reverse distillation, which systematically drives agents towards novel distributions outside the known policy space in a non-self-play manner. In addition, BiDist operates as a concise and efficient solution without the need for the complex and costly storage of past policies. We provide both theoretical analysis and empirical evidence to support BiDist's effectiveness. Our results highlight its remarkable generalization ability across a variety of cooperative, competitive, and social dilemma tasks, and reveal that BiDist significantly diversifies the policy distribution space. We also present comprehensive ablation studies to reinforce BiDist's effectiveness and key success factors. Source code is available in the supplementary material.


Diffusion Learning with Partial Agent Participation and Local Updates

arXiv.org Artificial Intelligence

Diffusion learning is a framework that endows edge devices with advanced intelligence. By processing and analyzing data locally and allowing each agent to communicate with its immediate neighbors, diffusion effectively protects the privacy of edge devices, enables real-time response, and reduces reliance on central servers. However, traditional diffusion learning relies on communication at every iteration, leading to communication overhead, especially with large learning models. Furthermore, the inherent volatility of edge devices, stemming from power outages or signal loss, poses challenges to reliable communication between neighboring agents. To mitigate these issues, this paper investigates an enhanced diffusion learning approach incorporating local updates and partial agent participation. Local updates will curtail communication frequency, while partial agent participation will allow for the inclusion of agents based on their availability. We prove that the resulting algorithm is stable in the mean-square error sense and provide a tight analysis of its Mean-Square-Deviation (MSD) performance. Various numerical experiments are conducted to illustrate our theoretical findings.
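To make the two ingredients in this abstract concrete, here is a toy sketch of one communication round of an adapt-then-combine style diffusion scheme with local updates and random agent availability. It is an illustration under stated assumptions, not the paper's algorithm: each agent holds a scalar estimate and a quadratic cost `0.5 * (w - target)**2`, active agents average uniformly over themselves and their active neighbors, and inactive agents simply keep their previous estimate. All names (`diffusion_round`, `p_active`, `local_steps`) are hypothetical.

```python
import random

def diffusion_round(w, targets, neighbors, mu, local_steps, p_active, rng):
    """One communication round for scalar estimates w[k] held by agent k.

    neighbors: dict mapping agent index to a list of neighbor indices.
    mu: step size; local_steps: gradient steps taken between communications.
    p_active: assumed probability that an agent is available in this round.
    """
    n = len(w)
    active = [rng.random() < p_active for _ in range(n)]

    # Adapt: each active agent runs several local gradient steps without communicating.
    psi = list(w)
    for k in range(n):
        if not active[k]:
            continue
        for _ in range(local_steps):
            psi[k] -= mu * (psi[k] - targets[k])   # gradient of 0.5 * (w - target)^2

    # Combine: each active agent averages over itself and its active neighbors.
    new_w = list(w)
    for k in range(n):
        if not active[k]:
            continue                               # unavailable agents keep their old estimate
        group = [k] + [l for l in neighbors[k] if active[l]]
        new_w[k] = sum(psi[l] for l in group) / len(group)
    return new_w
```

Raising `local_steps` lowers how often agents must communicate per gradient step, and `p_active` models the device volatility the abstract mentions. With a constant step size, the estimates settle near, rather than exactly at, the minimizer of the aggregate cost.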