Agents
Simulation of Autonomous Industrial Vehicle Fleet Using Fuzzy Agents: Application to Task Allocation and Battery Charge Management
Grosset, Juliette, Fougères, Alain-Jérôme, Oukacha, Ouzna, Djoko-Kouam, Moïse, Bonnin, Jean-Marie
Abstract: The research introduces a multi-agent simulation that uses fuzzy inference to investigate, in a comprehensive manner, the work distribution and battery charging control of mobile baggage conveyor robots in an airport. Thanks to a distributed system, this simulation approach provides high adaptability, adjusting to changes in conveyor agent availability, battery capacity, awareness of the activities of the conveyor fleet, and knowledge of the context of infrastructure resource availability. Dynamic factors, such as workload variations and communication between the conveyor agents and infrastructure, are considered as heuristics, highlighting the importance of flexible and collaborative approaches in autonomous systems. The results highlight the effectiveness of adaptive fuzzy multi-agent models to optimize dynamic task allocation, adapt to the variation of baggage arrival flows, improve the overall operational efficiency of conveyor agents, and reduce their energy consumption.
Keywords: autonomous industrial vehicle, agent-based simulation, fuzzy agent, dynamic task allocation, battery charge management, Airport 4.0
1. INTRODUCTION
The implementation of fleets of Autonomous Industrial Vehicles (AIV) in the context of Airport 4.0 presents a number of challenges, all of which are connected to the true degree of autonomy of these vehicles: employee acceptance, vehicle localization, traffic flow, failure detection, collision avoidance, and vehicle perception in dynamic environments. Simulation makes it possible to take into account the various limitations and specifications developed by producers and potential consumers of these AIVs.
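As a rough illustration of how fuzzy inference could drive a charging decision of the kind the abstract describes, the sketch below computes a "charge urgency" score for one conveyor agent. Everything in it is an assumption made for illustration: the inputs (`battery_pct`, `queue_load`), the membership breakpoints, the three rules, and the zero-order Sugeno weighted average are hypothetical and are not taken from the paper.

```python
def ramp_down(x, lo, hi):
    """Membership that is 1 at or below lo, 0 at or above hi, linear between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def ramp_up(x, lo, hi):
    """Complement of ramp_down: 0 at or below lo, 1 at or above hi."""
    return 1.0 - ramp_down(x, lo, hi)


def charge_urgency(battery_pct, queue_load):
    """Return a charge urgency in [0, 1] (hypothetical rule base).

    battery_pct: remaining battery in percent (0..100).
    queue_load:  fraction of pending baggage tasks (0..1).
    """
    # Fuzzify the inputs (breakpoints are illustrative assumptions).
    low = ramp_down(battery_pct, 20, 60)
    high = ramp_up(battery_pct, 20, 60)
    busy = ramp_up(queue_load, 0.2, 0.8)
    idle = 1.0 - busy

    # Each rule is (firing strength, crisp consequent); AND is min.
    rules = [
        (low, 1.0),              # battery low            -> charge now
        (min(high, busy), 0.0),  # battery high and busy  -> keep working
        (min(high, idle), 0.5),  # battery high and idle  -> opportunistic charge
    ]

    # Zero-order Sugeno defuzzification: weighted average of consequents.
    total = sum(weight for weight, _ in rules)
    return sum(weight * value for weight, value in rules) / total
```

With these assumed rules, an agent at 10% battery gets urgency 1.0 regardless of workload, while a nearly full agent only considers charging when the baggage queue is quiet.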
AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems
Yang, Yingxuan, Chai, Huacan, Shao, Shuai, Song, Yuanyi, Qi, Siyuan, Rui, Renting, Zhang, Weinan
The rapid advancement of Large Language Models (LLMs) has catalyzed the development of multi-agent systems, where multiple LLM-based agents collaborate to solve complex tasks. However, existing systems predominantly rely on centralized coordination, which introduces scalability bottlenecks, limits adaptability, and creates single points of failure. Additionally, concerns over privacy and proprietary knowledge sharing hinder cross-organizational collaboration, leading to siloed expertise. To address these challenges, we propose AgentNet, a decentralized, Retrieval-Augmented Generation (RAG)-based framework that enables LLM-based agents to autonomously evolve their capabilities and collaborate efficiently in a Directed Acyclic Graph (DAG)-structured network. Unlike traditional multi-agent systems that depend on static role assignments or centralized control, AgentNet allows agents to specialize dynamically, adjust their connectivity, and route tasks without relying on predefined workflows. AgentNet's core design is built upon several key innovations: (1) Fully Decentralized Paradigm: Removing the central orchestrator, allowing agents to coordinate and specialize autonomously, fostering fault tolerance and emergent collective intelligence. (2) Dynamically Evolving Graph Topology: Real-time adaptation of agent connections based on task demands, ensuring scalability and resilience. (3) Adaptive Learning for Expertise Refinement: A retrieval-based memory system that enables agents to continuously update and refine their specialized skills. By eliminating centralized control, AgentNet enhances fault tolerance, promotes scalable specialization, and enables privacy-preserving collaboration across organizations. Through decentralized coordination and minimal data exchange, agents can leverage diverse knowledge sources while safeguarding sensitive information.
Off-Policy Evaluation for Sequential Persuasion Process with Unobserved Confounding
S., Nishanth Venkatesh, Bang, Heeseung, Malikopoulos, Andreas A.
In this paper, we expand the Bayesian persuasion framework to account for unobserved confounding variables in sender-receiver interactions. While traditional models typically assume that belief updates follow Bayesian principles, real-world scenarios often involve hidden variables that impact the receiver's belief formation and decision-making. Crucially, the receiver's belief update is affected by an unobserved confounding variable. By reformulating this scenario as a Partially Observable Markov Decision Process (POMDP), we capture the sender's incomplete information regarding both the dynamics of the receiver's beliefs and the unobserved confounder. We prove that finding an optimal observation-based policy in this POMDP is equivalent to solving for an optimal signaling strategy in the original persuasion framework. Furthermore, we demonstrate how this reformulation facilitates the application of proximal learning for off-policy evaluation (OPE) in the persuasion process. This advancement enables the sender to evaluate alternative signaling strategies using only observational data from a behavioral policy, thus eliminating the necessity for costly new experiments.
Strategic information sharing plays a critical role in economic interactions, policy design, and multi-agent systems [1]-[3]. Bayesian persuasion was first introduced by Kamenica and Gentzkow [4] as a powerful framework for analyzing how a sender can strategically reveal information to influence a receiver's decisions. In the standard setting, a sender commits to an information disclosure policy before observing the state of the world, and the receiver, after observing the sender's message, forms posterior beliefs and takes an action that affects both the sender's and the receiver's utilities. Despite its theoretical elegance, Bayesian persuasion rests on assumptions that may not hold in practical settings.
First, the framework presupposes that the sender possesses complete information about the receiver, including their observation process and all features that influence their decision-making (including utility functions).
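To make the standard setting concrete, the following minimal sketch shows the receiver's Bayesian belief update after observing a message from a committed disclosure policy. The state names, messages, and probabilities are hypothetical illustration values, and the function `posterior` is not taken from the paper; it is only the textbook Bayes rule that the standard framework assumes, which is the assumption the paper relaxes.

```python
from fractions import Fraction


def posterior(prior, signaling, message):
    """Receiver's posterior over states after observing `message`.

    prior:     dict mapping state -> P(state).
    signaling: dict mapping state -> {message: P(message | state)},
               i.e. the disclosure policy the sender committed to.
    """
    # Joint probability of each state together with the observed message.
    joint = {s: prior[s] * signaling[s].get(message, 0) for s in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("message has zero probability under this policy")
    return {s: p / total for s, p in joint.items()}


# Hypothetical two-state example with exact arithmetic.
prior = {"guilty": Fraction(3, 10), "innocent": Fraction(7, 10)}
signaling = {
    "guilty": {"convict": Fraction(1)},
    "innocent": {"convict": Fraction(3, 7), "acquit": Fraction(4, 7)},
}

belief_after_convict = posterior(prior, signaling, "convict")
belief_after_acquit = posterior(prior, signaling, "acquit")
```

In this example the message "convict" raises the receiver's belief in "guilty" from 3/10 to exactly 1/2, which shows how a committed policy can shift beliefs without the sender ever lying.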
GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
Du, Enjun, Li, Xunkai, Jin, Tian, Zhang, Zhihan, Li, Rong-Hua, Wang, Guoren
The era of foundation models has revolutionized AI research, yet Graph Foundation Models (GFMs) remain constrained by the scarcity of large-scale graph corpora. Traditional graph data synthesis techniques primarily focus on simplistic structural operations, lacking the capacity to generate semantically rich nodes with meaningful textual attributes: a critical limitation for real-world applications. While large language models (LLMs) demonstrate exceptional text generation capabilities, their direct application to graph synthesis is impeded by context window limitations, hallucination phenomena, and structural consistency challenges. To address these issues, we introduce GraphMaster, the first multi-agent framework specifically designed for graph data synthesis in data-limited environments. GraphMaster orchestrates four specialized LLM agents (Manager, Perception, Enhancement, and Evaluation) that collaboratively optimize the synthesis process through iterative refinement, ensuring both semantic coherence and structural integrity. To rigorously evaluate our approach, we create new data-limited "Sub" variants of six standard graph benchmarks, specifically designed to test synthesis capabilities under realistic constraints. Additionally, we develop a novel interpretability assessment framework that combines human evaluation with a principled Grassmannian manifold-based analysis, providing both qualitative and quantitative measures of semantic coherence. Experimental results demonstrate that GraphMaster significantly outperforms traditional synthesis methods across multiple datasets, establishing a strong foundation for advancing GFMs in data-scarce environments.
Egocentric Conformal Prediction for Safe and Efficient Navigation in Dynamic Cluttered Environments
Shin, Jaeuk, Lee, Jungjin, Yang, Insoon
Since safe control of ego-vehicles depends on accurately predicting the future states of surrounding dynamic agents, numerous motion forecasting models [1, 2] have been developed to forecast an agent's future motions from historical data. Nevertheless, these predictions remain inherently prone to error, primarily because they lack information about hidden contexts or intents--such as agents' goals, velocity preferences, or even social relationships among human agents. To address these limitations, conformal prediction (CP) [3, 4] has been employed to reliably assess the models' predictive capabilities. The method offers a principled yet straightforward procedure for calibrating the models. At test time, the calibration results can be used to construct a confidence set that contains the true future states of the environment, assuming that the test and calibration data are exchangeable (i.e., their joint distribution is symmetric). Consequently, CP has been successfully applied to a variety of problems, including reinforcement learning [5, 6], linear
This work was supported in part by the Information and Communications Technology Planning and Evaluation (IITP) grants funded by MSIT No. 2022-0-00124, No. 2022-0-00480 and No. RS-2021-II211343, Artificial Intelligence Graduate School Program (Seoul National University). The authors are with the Department of Electrical and Computer Engineering, ASRI, Seoul National University, Seoul 08826, South Korea, {sju5379, jungbbal, insoonyang}@snu.ac.kr. arXiv:2504.00447v1
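The calibrate-then-predict procedure mentioned in the excerpt above can be sketched with generic split conformal prediction. This is a minimal sketch of the textbook method, not the egocentric variant the paper proposes; the function names, the scalar prediction, and the use of absolute prediction error as the nonconformity score are assumptions made for illustration.

```python
import math


def conformal_radius(calibration_errors, alpha):
    """Finite-sample (1 - alpha) quantile of the calibration scores.

    calibration_errors: nonconformity scores on held-out calibration data,
                        here assumed to be absolute prediction errors.
    alpha:              target miscoverage rate, e.g. 0.1 for 90% coverage.
    """
    n = len(calibration_errors)
    if n == 0:
        raise ValueError("need at least one calibration score")
    # Rank of the corrected quantile among the sorted scores.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this coverage level.
        return math.inf
    return sorted(calibration_errors)[k - 1]


def prediction_interval(prediction, radius):
    """Confidence set for a scalar prediction: all values within `radius`."""
    return (prediction - radius, prediction + radius)


# Hypothetical calibration errors, in arbitrary units.
errors = [0.1, 0.4, 0.2, 0.3, 0.5, 0.6, 0.7, 0.8, 0.9]
radius = conformal_radius(errors, alpha=0.5)
```

Under the exchangeability assumption stated in the text, the interval built from `radius` contains the true value with probability at least 1 - alpha, marginally over calibration and test data.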
Value Iteration for Learning Concurrently Executable Robotic Control Tasks
Tahmid, Sheikh A., Notomista, Gennaro
Many modern robotic systems such as multi-robot systems and manipulators exhibit redundancy, a property owing to which they are capable of executing multiple tasks. This work proposes a novel method, based on the Reinforcement Learning (RL) paradigm, to train redundant robots to be able to execute multiple tasks concurrently. Our approach differs from typical multi-objective RL methods insofar as the learned tasks can be combined and executed in possibly time-varying prioritized stacks. We do so by first defining a notion of task independence between learned value functions. We then use our definition of task independence to propose a cost functional that encourages a policy, based on an approximated value function, to accomplish its control objective while minimally interfering with the execution of higher priority tasks. This allows us to train a set of control policies that can be executed simultaneously. We also introduce a version of fitted value iteration to learn to approximate our proposed cost functional efficiently. We demonstrate our approach on several scenarios and robotic systems.
Amazon's AGI Lab Reveals Its First Work: Advanced AI Agents
Amazon is still seen as a bit of a laggard in the race to develop advanced artificial intelligence, but it has quietly created a lab that is now setting records when it comes to AI performance. Amazon's AGI SF Lab, which is located in San Francisco and dedicated to building artificial general intelligence, or AI that surpasses the capabilities of humans, revealed the first fruits of its work today: A new AI model capable of powering some of the most advanced AI agents available anywhere. The new model, called Amazon Nova Act, outperforms ones from OpenAI and Anthropic on several benchmarks designed to gauge the intelligence and aptitude of AI agents, Amazon says. On the benchmarks GroundUI Web and ScreenSpot, Amazon Nova Act performs better than Claude 3.7 Sonnet and OpenAI Computer Use Agent. A major part of Amazon's plan to compete in the AI market is to focus on building agents, and the new model's abilities reflect its efforts to build a generation of tools that can measure up to the very best available.
$\textit{Agents Under Siege}$: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks
Khan, Rana Muhammad Shahroz, Tan, Zhen, Yun, Sukwon, Flemming, Charles, Chen, Tianlong
Most discussions about Large Language Model (LLM) safety have focused on single-agent settings, but multi-agent LLM systems now create novel adversarial risks because their behavior depends on communication between agents and decentralized reasoning. In this work, we focus on attacking pragmatic systems that have constraints such as limited token bandwidth, latency between message delivery, and defense mechanisms. We design a $\textit{permutation-invariant adversarial attack}$ that optimizes prompt distribution across latency- and bandwidth-constrained network topologies to bypass distributed safety mechanisms within the system. Formulating the attack path as a problem of $\textit{maximum-flow minimum-cost}$, coupled with the novel $\textit{Permutation-Invariant Evasion Loss (PIEL)}$, we leverage graph-based optimization to maximize attack success rate while minimizing detection risk. Evaluating across models including $\texttt{Llama}$, $\texttt{Mistral}$, $\texttt{Gemma}$, $\texttt{DeepSeek}$ and other variants on various datasets like $\texttt{JailBreakBench}$ and $\texttt{AdversarialBench}$, our method outperforms conventional attacks by up to $7\times$, exposing critical vulnerabilities in multi-agent systems. Moreover, we demonstrate that existing defenses, including variants of $\texttt{Llama-Guard}$ and $\texttt{PromptGuard}$, fail to prohibit our attack, emphasizing the urgent need for multi-agent specific safety mechanisms.
PAARS: Persona Aligned Agentic Retail Shoppers
Mansour, Saab, Perelli, Leonardo, Mainetti, Lorenzo, Davidson, George, D'Amato, Stefano
In e-commerce, behavioral data is collected for decision making, which can be costly and slow. Simulation with LLM-powered agents is emerging as a promising alternative for representing human population behavior. However, LLMs are known to exhibit certain biases, such as brand bias, review rating bias and limited representation of certain groups in the population; hence, they need to be carefully benchmarked and aligned to user behavior. Ultimately, our goal is to synthesise an agent population and verify that it collectively approximates a real sample of humans. To this end, we propose a framework that: (i) creates synthetic shopping agents by automatically mining personas from anonymised historical shopping data, (ii) equips agents with retail-specific tools to synthesise shopping sessions and (iii) introduces a novel alignment suite measuring distributional differences between humans and shopping agents at the group (i.e. population) level rather than the traditional "individual" level. Experimental results demonstrate that using personas improves performance on the alignment suite, though a gap to human behaviour remains. We showcase an initial application of our framework for automated agentic A/B testing and compare the findings to human results. Finally, we discuss applications, limitations and challenges, setting the stage for impactful future work.
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
Wang, Yuping, Huang, Xiangyu, Sun, Xiaokang, Yan, Mingxuan, Xing, Shuo, Tu, Zhengzhong, Li, Jiachen
We introduce UniOcc, a comprehensive, unified benchmark for occupancy forecasting (i.e., predicting future occupancies based on historical information) and current-frame occupancy prediction from camera images. UniOcc unifies data from multiple real-world datasets (i.e., nuScenes, Waymo) and high-fidelity driving simulators (i.e., CARLA, OpenCOOD), which provides 2D/3D occupancy labels with per-voxel flow annotations and support for cooperative autonomous driving. In terms of evaluation, unlike existing studies that rely on suboptimal pseudo labels for evaluation, UniOcc incorporates novel metrics that do not depend on ground-truth occupancy, enabling robust assessment of additional aspects of occupancy quality. Through extensive experiments on state-of-the-art models, we demonstrate that large-scale, diverse training data and explicit flow information significantly enhance occupancy prediction and forecasting performance.