Goto

Collaborating Authors

 Agents


Rule-Based Conflict-Free Decision Framework in Swarm Confrontation

arXiv.org Artificial Intelligence

Traditional rule-based decision-making methods with interpretable advantage, such as finite state machine, suffer from the jitter or deadlock(JoD) problems in extremely dynamic scenarios. To realize agent swarm confrontation, decision conflicts causing many JoD problems are a key issue to be solved. Here, we propose a novel decision-making framework that integrates probabilistic finite state machine, deep convolutional networks, and reinforcement learning to implement interpretable intelligence into agents. Our framework overcomes state machine instability and JoD problems, ensuring reliable and adaptable decisions in swarm confrontation. The proposed approach demonstrates effective performance via enhanced human-like cooperation and competitive strategies in the rigorous evaluation of real experiments, outperforming other methods.


ReAgent: Reversible Multi-Agent Reasoning for Knowledge-Enhanced Multi-Hop QA

arXiv.org Artificial Intelligence

Recent advances in large language models (LLMs) have significantly improved multi-hop question answering (QA) through direct Chain-of-Thought (CoT) reasoning. However, the irreversible nature of CoT leads to error accumulation, making it challenging to correct mistakes in multi-hop reasoning. This paper introduces ReAgent: a Reversible multi-Agent collaborative framework augmented with explicit backtracking mechanisms, enabling reversible multi-hop reasoning. By incorporating text-based retrieval, information aggregation and validation, our system can detect and correct errors mid-reasoning, leading to more robust and interpretable QA outcomes. The framework and experiments serve as a foundation for future work on error-tolerant QA systems. Empirical evaluations across three benchmarks indicate ReAgent's efficacy, yielding average about 6\% improvements against baseline models.


Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms: Challenges and a Roadmap

arXiv.org Artificial Intelligence

This article proposes a roadmap to address the current challenges in small-scale testbeds for Connected and Automated Vehicles (CAVs) and robot swarms. The roadmap is a joint effort of participants in the workshop "1st Workshop on Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms," held on June 2 at the IEEE Intelligent Vehicles Symposium (IV) 2024 in Jeju, South Korea. The roadmap contains three parts: 1) enhancing accessibility and diversity, especially for underrepresented communities, 2) sharing best practices for the development and maintenance of testbeds, and 3) connecting testbeds through an abstraction layer to support collaboration. The workshop features eight invited speakers, four contributed papers [1]-[4], and a presentation of a survey paper on testbeds [5]. The survey paper provides an online comparative table of more than 25 testbeds, available at https://bassamlab.github.io/testbeds-survey. The workshop's own website is available at https://cpm-remote.lrt.unibw-muenchen.de/iv24-workshop.


Less is more? Rewards in RL for Cyber Defence

arXiv.org Artificial Intelligence

The last few years have seen an explosion of interest in autonomous cyber defence agents based on deep reinforcement learning. Such agents are typically trained in a cyber gym environment, also known as a cyber simulator, at least 32 of which have already been built. Most, if not all cyber gyms provide dense "scaffolded" reward functions which combine many penalties or incentives for a range of (un)desirable states and costly actions. Whilst dense rewards help alleviate the challenge of exploring complex environments, yielding seemingly effective strategies from relatively few environment steps; they are also known to bias the solutions an agent can find, potentially towards suboptimal solutions. This is especially a problem in complex cyber environments where policy weaknesses may not be noticed until exploited by an adversary. In this work we set out to evaluate whether sparse reward functions might enable training more effective cyber defence agents. Towards this goal we first break down several evaluation limitations in existing work by proposing a ground truth evaluation score that goes beyond the standard RL paradigm used to train and evaluate agents. By adapting a well-established cyber gym to accommodate our methodology and ground truth score, we propose and evaluate two sparse reward mechanisms and compare them with a typical dense reward. Our evaluation considers a range of network sizes, from 2 to 50 nodes, and both reactive and proactive defensive actions. Our results show that sparse rewards, particularly positive reinforcement for an uncompromised network state, enable the training of more effective cyber defence agents. Furthermore, we show that sparse rewards provide more stable training than dense rewards, and that both effectiveness and training stability are robust to a variety of cyber environment considerations.


Who bought this smoked salmon? How 'AI agents' will change the internet (and shopping lists)

The Guardian

Armed with my shopping list, it types each item into the search bar of a supermarket website, then uses its cursor to click. Watching what appears to be a digital ghost do this usually mundane task is strangely transfixing. "Are you sure it's not just a person in India?" my husband asks, peering over my shoulder. Made available to UK users last month, it has a similar text interface and conversational tone to ChatGPT, but rather than just answering questions, it can actually do things โ€“ provided they involve navigating a web browser. Hot on the heels of large language models, AI agents have been trumpeted as the next big thing, and you can see the appeal: a digital assistant that can complete practical tasks is more compelling than one that can just talk back.


Chance-constrained Linear Quadratic Gaussian Games for Multi-robot Interaction under Uncertainty

arXiv.org Artificial Intelligence

We address safe multi-robot interaction under uncertainty. In particular, we formulate a chance-constrained linear quadratic Gaussian game with coupling constraints and system uncertainties. We find a tractable reformulation of the game and propose a dual ascent algorithm. We prove that the algorithm converges to a generalized Nash equilibrium of the reformulated game, ensuring the satisfaction of the chance constraints. We test our method in driving simulations and real-world robot experiments. Our method ensures safety under uncertainty and generates less conservative trajectories than single-agent model predictive control.


Unlocking Generalization for Robotics via Modularity and Scale

arXiv.org Artificial Intelligence

How can we build generalist robot systems? Scale may not be enough due to the significant multimodality of robotics tasks, lack of easily accessible data and the challenges of deploying on physical hardware. Meanwhile, most deployed robotic systems today are inherently modular and can leverage the independent generalization capabilities of each module to perform well. Therefore, this thesis seeks to tackle the task of building generalist robot agents by integrating these components into one: combining modularity with large-scale learning for general purpose robot control. The first question we consider is: how can we build modularity and hierarchy into learning systems? Our key insight is that rather than having the agent learn hierarchy and low-level control end-to-end, we can enforce modularity via planning to enable more efficient and capable robot learners. Next, we come to the role of scale in building generalist robot systems. To scale, neural networks require vast amounts of diverse data, expressive architectures to fit the data and a source of supervision to generate the data. We leverage a powerful supervision source: classical planning, which can generalize, but is expensive to run and requires access to privileged information to perform well in practice. We use these planners to supervise large-scale policy learning in simulation to produce generalist agents. Finally, we consider how to unify modularity with large-scale policy learning to build real-world robot systems capable of performing zero-shot manipulation. We do so by tightly integrating key ingredients of modular high and mid-level planning, learned local control, procedural scene generation and large-scale policy learning for sim2real transfer. We demonstrate that this recipe can produce a single, generalist agent that can solve challenging long-horizon manipulation tasks in the real world.


The potential role of AI agents in transforming nuclear medicine research and cancer management in India

arXiv.org Artificial Intelligence

India faces a significant cancer burden, with an incidence - to - mortality ratio indicating that nearly three out of five individuals diagnosed with cancer succumb to the disease. While the limitations of physical healthcare infrastructure are widely acknowledged as a primary challenge, concerted efforts by government and healthcare agencies are underway to mitigate these constraints. However, given the country's vast geography and high population density, it is imperative to explore alternative soft infrastructure solutions to complement existing frameworks . Artificial Intelligence agents are increasingly transforming problem - solving approaches across various domains, with their application in medicine proving particularly transformative. In this perspective, we examine the potential role of AI agents in advancing nuclear medicine fo r cancer research, diagnosis, and management in India. We begin with a brief overview of AI agents and their capabilities, followed by a proposed agent - based ecosystem that can address prevailing sustainability challenges in India's nuclear medicine. Keywords: AI Agents; cancer; nuclear medicine ecosystem; sustainability challenges 1. Introduction India's with population of 1.4 billion faces a significant cancer burden, with ~1.5 million new cases and ~850,000 deaths annually [1] [2] . With an i ncidence - to - m ortality p ercentage of approximately 64.8%, nearly three out of five individuals diagnosed with cancer are expected to succumb to the disease [2] . Projections indicate that mortality rates will rise significantly, increasing from 64.7% to 109.6% between 2022 and 2050, largely due to demographic shifts as the reproductive - age population transitions into middle and old age. This growing cancer burden will place even more pressure on the already overburdened healthcare system, making it essential to address the gaps in both infrastructure and indigenous research and innovations to ensure timely and effective patient treatment [3] . This trend underscores the urgent need for a resilient, patient - centred framework that integrates medical advancements, early detection through diagnostics, timely therapeutic interventions, and equitable access to care. Nuclear medicine uses a small amount of targeted radioactive material to diagnose and treat cancer [4] .


Think Twice, Click Once: Enhancing GUI Grounding via Fast and Slow Systems

arXiv.org Artificial Intelligence

Humans can flexibly switch between different modes of thinking based on task complexity: from rapid intuitive judgments to in-depth analytical understanding. However, current Graphical User Interface (GUI) grounding systems which locate interface elements based on natural language instructions rely solely on immediate prediction without reasoning, struggling to understand complex interface layouts with nested structures and hierarchical relationships, limiting their effectiveness on complex interfaces. Inspired by human dual-system cognition, we present Focus, a novel GUI grounding framework that combines fast prediction with systematic analysis. The framework dynamically switches between rapid and deliberate processing through an adaptive system switching based on task complexity, optimizing both efficiency and accuracy. Focus decomposes grounding into progressive stages: interface summarization, visual focused analysis, and precise coordinate prediction. This structured decomposition enables systematic understanding of both interface layouts and visual relationships. Extensive experiments show that Focus achieves state-of-the-art performance using only 300K of the training data with a 2B parameter model compared to existing approaches. Focus demonstrates superior performance particularly in complex GUI scenarios, achieving 77.4% average accuracy on ScreenSpot and 13.3% on the more challenging ScreenSpot-Pro. Our analysis reveals the effectiveness of this dual-system approach while demonstrating its potential for improving complex GUI interaction scenarios.


Collective Behavior Clone with Visual Attention via Neural Interaction Graph Prediction

arXiv.org Artificial Intelligence

In this paper, we propose a framework, collective behavioral cloning (CBC), to learn the underlying interaction mechanism and control policy of a swarm system. Given the trajectory data of a swarm system, we propose a graph variational autoencoder (GVAE) to learn the local interaction graph. Based on the interaction graph and swarm trajectory, we use behavioral cloning to learn the control policy of the swarm system. To demonstrate the practicality of CBC, we deploy it on a real-world decentralized vision-based robot swarm system. A visual attention network is trained based on the learned interaction graph for online neighbor selection. Experimental results show that our method outperforms previous approaches in predicting both the interaction graph and swarm actions with higher accuracy. This work offers a promising approach for understanding interaction mechanisms and swarm dynamics in future swarm robotics research. Code and data are available.