Agents
A novel strategy for multi-resource load balancing in agent-based systems
Sliwko, Leszek, Zgrzywa, Aleksander
The paper presents a multi-resource load balancing strategy which can be utilised within an agent-based system. This approach can assist system designers in their attempts to optimise the structure for complex enterprise architectures. In this system, the social behaviour of the agent and its adaptation abilities are applied to determine an optimal setup for a given configuration. All the methods have been developed to allow the agent's self-assessment. The proposed agent system has been implemented and the experiment results are presented here.
An improved clustering-based multi-swarm PSO using local diversification and topology information
Matanga, Yves, Sun, Yanxia, Wang, Zenghui
Multi-swarm particle optimisation algorithms are gaining popularity due to their ability to locate multiple optimum points concurrently. In this family of algorithms, clustering-based multi-swarm algorithms are among the most effective techniques that join the closest particles together to form independent niche swarms that exploit potential promising regions. However, most clustering-based multi-swarms are Euclidean distance-based and only inquire about the potential of one peak within a cluster and thus can lose multiple peaks due to poor resolution. In a bid to improve the peak detection ratio, the current study proposes two enhancements. First, a preliminary local search across initial particles is proposed to ensure that each local region is sufficiently scouted prior to particle collaboration. Secondly, an investigative clustering approach that performs concavity analysis is proposed to evaluate the potential for several sub-niches within a single cluster. An improved clustering-based multi-swarm PSO (TImPSO) has resulted from these enhancements and has been tested against three competing algorithms in the same family using the IEEE CEC2013 niching datasets, resulting in an improved peak ratio for almost all the test functions.
Generative Caching for Structurally Similar Prompts and Responses
Chakraborty, Sarthak, Nath, Suman, Zhang, Xuchao, Bansal, Chetan, Gupta, Indranil
Large Language Models (LLMs) are increasingly being used to plan, reason, and execute tasks across diverse scenarios. In use cases like repeatable workflows and agentic settings, prompts are often reused with minor variations while having a similar structure for recurring tasks. This opens up opportunities for caching. However, exact prompt matching fails on such structurally similar prompts, while semantic caching may produce incorrect responses by ignoring critical differences. To address this, we introduce \ourmethod{}, a generative cache that produces variation-aware responses for structurally similar prompts. \ourmethod{} identifies reusable response patterns across similar prompt structures and synthesizes customized outputs for new requests. We show that \ourmethod{} achieves 83\% cache hit rate, while having minimal incorrect hits on datasets without prompt repetition. In agentic workflows, it improves cache hit rate by $\sim$20\% and reduces end-to-end execution latency by $\sim$34\% compared to standard prompt matching.
Hiding in the AI Traffic: Abusing MCP for LLM-Powered Agentic Red Teaming
Janjusevic, Strahinja, Garcia, Anna Baron, Kazerounian, Sohrob
Generative AI is reshaping offensive cybersecurity by enabling autonomous red team agents that can plan, execute, and adapt during penetration tests. However, existing approaches face trade-offs between generality and specialization, and practical deployments reveal challenges such as hallucinations, context limitations, and ethical concerns. In this work, we introduce a novel command & control (C2) architecture leveraging the Model Context Protocol (MCP) to coordinate distributed, adaptive reconnaissance agents covertly across networks. Notably, we find that our architecture not only improves goal-directed behavior of the system as whole, but also eliminates key host and network artifacts that can be used to detect and prevent command & control behavior altogether. We begin with a comprehensive review of state-of-the-art generative red teaming methods, from fine-tuned specialist models to modular or agentic frameworks, analyzing their automation capabilities against task-specific accuracy. We then detail how our MCP-based C2 can overcome current limitations by enabling asynchronous, parallel operations and real-time intelligence sharing without periodic beaconing. We furthermore explore advanced adversarial capabilities of this architecture, its detection-evasion techniques, and address dual-use ethical implications, proposing defensive measures and controlled evaluation in lab settings. Experimental comparisons with traditional C2 show drastic reductions in manual effort and detection footprint. We conclude with future directions for integrating autonomous exploitation, defensive LLM agents, predictive evasive maneuvers, and multi-agent swarms. The proposed MCP-enabled C2 framework demonstrates a significant step toward realistic, AI-driven red team operations that can simulate advanced persistent threats while informing the development of next-generation defensive systems.
DeepFleet: Multi-Agent Foundation Models for Mobile Robots
Agaskar, Ameya, Siva, Sriram, Pickering, William, O'Brien, Kyle, Kekeh, Charles, Li, Ang, Sarker, Brianna Gallo, Chua, Alicia, Nemade, Mayur, Thattai, Charun, Di, Jiaming, Iyengar, Isaac, Dharoor, Ramya, Kirouani, Dino, Erskine, Jimmy, Hegazy, Tamir, Niekum, Scott, Khan, Usman A., Pecora, Federico, Durham, Joseph W.
We introduce DeepFleet, a suite of foundation models designed to support coordination and planning for large-scale mobile robot fleets. These models are trained on fleet movement data, including robot positions, goals, and interactions, from hundreds of thousands of robots in Amazon warehouses worldwide. DeepFleet consists of four architectures that each embody a distinct inductive bias and collectively explore key points in the design space for multi-agent foundation models: the robot-centric (RC) model is an autoregressive decision transformer operating on neighborhoods of individual robots; the robot-floor (RF) model uses a transformer with cross-attention between robots and the warehouse floor; the image-floor (IF) model applies convolutional encoding to a multi-channel image representation of the full fleet; and the graph-floor (GF) model combines temporal attention with graph neural networks for spatial relationships. In this paper, we describe these models and present our evaluation of the impact of these design choices on prediction task performance. We find that the robot-centric and graph-floor models, which both use asynchronous robot state updates and incorporate the localized structure of robot interactions, show the most promise. We also present experiments that show that these two models can make effective use of larger warehouses operation datasets as the models are scaled up.
Amazon Is Using Specialized AI Agents for Deep Bug Hunting
Born out of an internal hackathon, Amazon's Autonomous Threat Analysis system uses a variety of specialized AI agents to detect weaknesses and propose fixes to the company's platforms. As generative AI pushes the speed of software development, it is also enhancing the ability of digital attackers to carry out financially motivated or state-backed hacks. This means that security teams at tech companies have more code than ever to review while dealing with even more pressure from bad actors. On Monday, Amazon will publish details for the first time of an internal system known as Autonomous Threat Analysis (ATA), which the company has been using to help its security teams proactively identify weaknesses in its platforms, perform variant analysis to quickly search for other, similar flaws, and then develop remediations and detection capabilities to plug holes before attackers find them. ATA was born out of an internal Amazon hackathon in August 2024, and security team members say that it has grown into a crucial tool since then.
What is AI poisoning? A computer scientist explains
Poisoning is a term most often associated with the human body and natural environments . But it is also a growing problem in the world of artificial intelligence (AI) - in particular, for large language models such as ChatGPT and Claude. In fact, a joint study by the UK AI Security Institute, Alan Turing Institute and Anthropic, published earlier this month, found that inserting as few as 250 malicious files into the millions in a model's training data can secretly "poison" it. So what exactly is AI poisoning? And what risks does it pose?
Steering Noncooperative Games Through Conjecture Design
Morri, Francesco, Cadre, Hรฉlรจne Le, Salas, David, Aussel, Didier
In dynamic noncooperative games, each player makes conjectures about other players' reactions before choosing a strategy. However, resulting equilibria may be multiple and do not always lead to desirable outcomes. These issues are typically addressed separately, for example, through opponent modelling and incentive design. Drawing inspiration from conjectural variations games, we propose an incentive design framework in which a coordinator first computes an equilibrium by optimizing a predefined objective function, then communicates this equilibrium as a target for the players to reach. In a centralized setting, the coordinator also optimizes the conjectures to steer the players towards the target. In decentralized settings, players independently compute conjectures and update their strategies based on individual targets. We provide a guarantee of equilibrium existence in both cases. This framework uses conjectures not only to guide the system towards desirable outcomes but also to decouple the game into independent optimization problems, enabling efficient computation and parallelization in large-scale settings. We illustrate our theoretical results on classical representative noncooperative games, demonstrating its application potential.
MDG: Masked Denoising Generation for Multi-Agent Behavior Modeling in Traffic Environments
Huang, Zhiyu, Zhou, Zewei, Cai, Tianhui, Zhang, Yun, Ma, Jiaqi
Modeling realistic and interactive multi-agent behavior is critical to autonomous driving and traffic simulation. However, existing diffusion and autoregressive approaches are limited by iterative sampling, sequential decoding, or task-specific designs, which hinder efficiency and reuse. We propose Masked Denoising Generation (MDG), a unified generative framework that reformulates multi-agent behavior modeling as the reconstruction of independently noised spatiotemporal tensors. Instead of relying on diffusion time steps or discrete tokenization, MDG applies continuous, per-agent and per-timestep noise masks that enable localized denoising and controllable trajectory generation in a single or few forward passes. This mask-driven formulation generalizes across open-loop prediction, closed-loop simulation, motion planning, and conditional generation within one model. Trained on large-scale real-world driving datasets, MDG achieves competitive closed-loop performance on the Waymo Sim Agents and nuPlan Planning benchmarks, while providing efficient, consistent, and controllable open-loop multi-agent trajectory generation. These results position MDG as a simple yet versatile paradigm for multi-agent behavior modeling.
Harnessing Data from Clustered LQR Systems: Personalized and Collaborative Policy Optimization
Kanakeri, Vinay, Bajaj, Shivam, Verma, Ashwin, Gupta, Vijay, Mitra, Aritra
It is known that reinforcement learning (RL) is data-hungry. To improve sample-efficiency of RL, it has been proposed that the learning algorithm utilize data from 'approximately similar' processes. However, since the process models are unknown, identifying which other processes are similar poses a challenge. In this work, we study this problem in the context of the benchmark Linear Quadratic Regulator (LQR) setting. Specifically, we consider a setting with multiple agents, each corresponding to a copy of a linear process to be controlled. The agents' local processes can be partitioned into clusters based on similarities in dynamics and tasks. Combining ideas from sequential elimination and zeroth-order policy optimization, we propose a new algorithm that performs simultaneous clustering and learning to output a personalized policy (controller) for each cluster. Under a suitable notion of cluster separation that captures differences in closed-loop performance across systems, we prove that our approach guarantees correct clustering with high probability. Furthermore, we show that the sub-optimality gap of the policy learned for each cluster scales inversely with the size of the cluster, with no additional bias, unlike in prior works on collaborative learning-based control. Our work is the first to reveal how clustering can be used in data-driven control to learn personalized policies that enjoy statistical gains from collaboration but do not suffer sub-optimality due to inclusion of data from dissimilar processes. From a distributed implementation perspective, our method is attractive as it incurs only a mild logarithmic communication overhead.