Goto

Collaborating Authors

 Agents


Exploring Equity of Climate Policies using Multi-Agent Multi-Objective Reinforcement Learning

arXiv.org Artificial Intelligence

Addressing climate change requires coordinated policy efforts of nations worldwide. These efforts are informed by scientific reports, which rely in part on Integrated Assessment Models (IAMs), prominent tools used to assess the economic impacts of climate policies. However, traditional IAMs optimize policies based on a single objective, limiting their ability to capture the trade-offs among economic growth, temperature goals, and climate justice. As a result, policy recommendations have been criticized for perpetuating inequalities, fueling disagreements during policy negotiations. We introduce Justice, the first framework integrating IAM with Multi-Objective Multi-Agent Reinforcement Learning (MOMARL). By incorporating multiple objectives, Justice generates policy recommendations that shed light on equity while balancing climate and economic goals. Further, using multiple agents can provide a realistic representation of the interactions among the diverse policy actors. We identify equitable Pareto-optimal policies using our framework, which facilitates deliberative decision-making by presenting policymakers with the inherent trade-offs in climate and economic policy.


Modeling and Control of Magnetic Forces between Microrobots

arXiv.org Artificial Intelligence

The independent control of multiple magnetic microrobots under a shared global signal presents critical challenges in biomedical applications such as targeted drug delivery and microsurgeries. Most existing systems only allow all agents to move synchronously, limiting their use in applications that require differentiated actuation. This research aims to design a controller capable of regulating the radial distance between micro-agents using only the angle ฯˆof a global magnetic field as the actuation parameter, demonstrating potential for practical applications. The proposed cascade control approach enables faster and more precise adjustment of the inter-agent distance than a proportional controller, while maintaining smooth transitions and avoiding abrupt changes in the orientation of the magnetic field, making it suitable for real-world implementation. A bibliographic review was conducted to develop the physical model, considering magnetic dipole-dipole interactions and velocities in viscous media. A PID controller was implemented to regulate the radial distance, followed by a PD controller in cascade to smooth changes in field orientation. These controllers were simulated in MATLAB, showing that the PID controller reduced convergence time to the desired radius by about 40%. When adding the second controller, the combined PID+PD scheme achieved smooth angular trajectories within similar timeframes, with fluctuations of only \pm 5^\circ. These results validate the feasibility of controlling the radial distance of two microrobots using a shared magnetic field in a fast and precise manner, without abrupt variations in the control angle. However, the model is limited to a 2D environment and two agents, suggesting future research to extend the controller to 3D systems and multiple agents.


Emergent Convergence in Multi-Agent LLM Annotation

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly deployed in collaborative settings, yet little is known about how they coordinate when treated as black-box agents. We simulate 7500 multi-agent, multi-round discussions in an inductive coding task, generating over 125000 utterances that capture both final annotations and their interactional histories. We introduce process-level metrics: code stability, semantic self-consistency, and lexical confidence alongside sentiment and convergence measures, to track coordination dynamics. To probe deeper alignment signals, we analyze the evolving geometry of output embeddings, showing that intrinsic dimensionality declines over rounds, suggesting semantic compression. The results reveal that LLM groups converge lexically and semantically, develop asymmetric influence patterns, and exhibit negotiation-like behaviors despite the absence of explicit role prompting. This work demonstrates how black-box interaction analysis can surface emergent coordination strategies, offering a scalable complement to internal probe-based interpretability methods.


A Survey on Improving Human Robot Collaboration through Vision-and-Language Navigation

arXiv.org Artificial Intelligence

Vision-and-Language Navigation (VLN) is a multi-modal, cooperative task requiring agents to interpret human instructions, navigate 3D environments, and communicate effectively under ambiguity. This paper presents a comprehensive review of recent VLN advancements in robotics and outlines promising directions to improve multi-robot coordination. Despite progress, current models struggle with bidirectional communication, ambiguity resolution, and collaborative decision-making in the multi-agent systems. We review approximately 200 relevant articles to provide an in-depth understanding of the current landscape. Through this survey, we aim to provide a thorough resource that inspires further research at the intersection of VLN and robotics. We advocate that the future VLN systems should support proactive clarification, real-time feedback, and contextual reasoning through advanced natural language understanding (NLU) techniques. Additionally, decentralized decision-making frameworks with dynamic role assignment are essential for scalable, efficient multi-robot collaboration. These innovations can significantly enhance human-robot interaction (HRI) and enable real-world deployment in domains such as healthcare, logistics, and disaster response.


A Comprehensive Survey on Surgical Digital Twin

arXiv.org Artificial Intelligence

Such models are integral to the development of context-aware surgical training systems and process monitoring platforms [11], [19] as well as for encoding adaptive robotic control policies in teleoperated environments [13], [20], [78]. However, their limited capacity to capture continuous biophysical dynamics can constrain their utility in applications where physiological fidelity is essential. Recognizing the limitations inherent in purely continuous or discrete approaches, hybrid modeling strategies have emerged as a state-of-the-art solution for surgical digital twins. These frameworks integrate continuous dynamic models with discrete state machines, enabling the simultaneous tracking of physiological changes and procedural events [8], [7], [19], [37]. For example, hybrid automata have been deployed to synchronize real-time updates of tissue deformation with the sequencing of surgical tool actions [7], [19]. This integration allows digital twins to provide context-sensitive support, adapting to abrupt workflow transitions and physiological perturbations alike--a critical requirement in both routine and emergent surgical scenarios [8], [11], [7]. B. Mutual Information and Information-Theoretic Approaches With the proliferation of multi-modal surgical data, information-theoretic concepts have become indispensable for quantifying uncertainty, relevance, and redundancy across heterogeneous information streams. Mutual information I(X; Y) has been adopted as a rigorous metric for selecting the most informative sensors, imaging modalities, or clinical parameters, thereby enhancing the efficiency and robustness of digital twin-enabled decision support [2], [3], [13], [34], [11], [51], [48], [26], [29]. This is formally captured as Eq.


X-SYCON: Xylem-Inspired Passive Gradient Control for Communication-Free Swarm Response in Dynamic Disaster Environments

arXiv.org Artificial Intelligence

We present X-SYCON, a xylem-inspired multi-agent architecture in which coordination emerges from passive field dynamics rather than explicit planning or communication. Incidents (demands) and obstructions (hazards) continually write diffusing and decaying scalar fields, and agents greedily ascend a local utility $U=ฯ•_{\mathrm{DE}}-ฮบ\,ฯ•_{\mathrm{HZ}}$ with light anti-congestion and separation. A beaconing rule triggered on first contact temporarily deepens the local demand sink, accelerating completion without reducing time-to-first-response. Across dynamic, partially blocked simulated environments, we observe low miss rates and stable throughput with interpretable, tunable trade-offs over carrier count, arrival rate, hazard density, and hazard sensitivity $ฮบ$. We derive that a characteristic hydraulic length scale $\ell\approx\sqrt{D/ฮป}$ predicts recruitment range in a continuum approximation, and we provide a work-conservation (Ohm-law) bound consistent with sublinear capacity scaling with team size. Empirically: (i) soft hazard penalties yield fewer misses when obstacles already block motion; (ii) throughput saturates sublinearly with carriers while reliability improves sharply; (iii) stronger arrivals can reduce misses by sustaining sinks that recruit help; and (iv) phase-stability regions shrink with hazard density but are recovered by more carriers or higher arrivals. We refer to X-SYCON as an instance of Distributed Passive Computation and Control, and we evaluate it in simulations modeling communication-denied disaster response and other constrained sensing-action regimes.


DREAMer-VXS: A Latent World Model for Sample-Efficient AGV Exploration in Stochastic, Unobserved Environments

arXiv.org Artificial Intelligence

The paradigm of learning-based robotics holds immense promise, yet its translation to real-world applications is critically hindered by the sample inefficiency and brittleness of conventional model-free reinforcement learning algorithms. In this work, we address these challenges by introducing DREAMer-VXS, a model-based framework for Autonomous Ground Vehicle (AGV) exploration that learns to plan from imagined latent trajectories. Our approach centers on learning a comprehensive world model from partial and high-dimensional LiDAR observations. This world model is composed of a Convolutional Variational Autoencoder (VAE), which learns a compact representation of the environment's structure, and a Recurrent State-Space Model (RSSM), which models complex temporal dynamics. By leveraging this learned model as a high-speed simulator, the agent can train its navigation policy almost entirely in imagination. This methodology decouples policy learning from real-world interaction, culminating in a 90% reduction in required environmental interactions to achieve expert-level performance when compared to state-of-the-art model-free SAC baselines. The agent's behavior is guided by an actor-critic policy optimized with a composite reward function that balances task objectives with an intrinsic curiosity bonus, promoting systematic exploration of unknown spaces. We demonstrate through extensive simulated experiments that DREAMer-VXS not only learns orders of magnitude faster but also develops more generalizable and robust policies, achieving a 45% increase in exploration efficiency in unseen environments and superior resilience to dynamic obstacles.


Forthcoming machine learning and AI seminars: December 2025 edition

AIHub

This post contains a list of the AI-related seminars that are scheduled to take place between 1 December 2025 and 31 January 2026. All events detailed here are free and open for anyone to attend virtually. Dick den Hertog (University of Amsterdam) Association of European Operational Research Societies To receive the seminar link, sign up to the mailing list . Annabelle Gawer The Digital Humanism (DIGHUM) Initiative The talk will be livestreamed on YouTube here . Jesรบs Moreno-Leรณn (University of Seville) Raspberry PI Sign up here to join.


Why companies don't share AV crash data โ€“ and how they could

Robohub

Autonomous vehicles (AVs) have been tested as taxis for decades in San Francisco, Pittsburgh and around the world, and trucking companies have enormous incentives to adopt them. But AV companies rarely share the crash-and safety-related data that is crucial to improving the safety of their vehicles - mostly because they have little incentive to do so. Is AV safety data an auto company's intellectual asset or a public good? It can be both - with a little tweaking, according to a team of Cornell researchers. The team has created a roadmap outlining the barriers and opportunities to encourage AV companies to share the data to make AVs safer, from untangling public versus private data knowledge, to regulations to creating incentive programs.


Breaking Algorithmic Collusion in Human-AI Ecosystems

arXiv.org Artificial Intelligence

AI agents are increasingly deployed in ecosystems where they repeatedly interact not only with each other but also with humans. In this work, we study these human-AI ecosystems from a theoretical perspective, focusing on the classical framework of repeated pricing games. In our stylized model, the AI agents play equilibrium strategies, and one or more humans manually perform the pricing task instead of adopting an AI agent, thereby defecting to a no-regret strategy. Motivated by how populations of AI agents can sustain supracompetitive prices, we investigate whether high prices persist under such defections. Our main finding is that even a single human defection can destabilize collusion and drive down prices, and multiple defections push prices even closer to competitive levels. We further show how the nature of collusion changes under defection-aware AI agents. Taken together, our results characterize when algorithmic collusion is fragile--and when it persists--in mixed ecosystems of AI agents and humans.