Goto

Collaborating Authors

 Agents


RoboComm: A DID-based scalable and privacy-preserving Robot-to-Robot interaction over state channels

arXiv.org Artificial Intelligence

In a multi robot system establishing trust amongst untrusted robots from different organisations while preserving a robot's privacy is a challenge. Recently decentralized technologies such as smart contract and blockchain are being explored for applications in robotics. However, the limited transaction processing and high maintenance cost hinder the widespread adoption of such approaches. Moreover, blockchain transactions be they on public or private permissioned blockchain are publically readable which further fails to preserve the confidentiality of the robot's data and privacy of the robot. In this work, we propose RoboComm a Decentralized Identity based approach for privacy-preserving interaction between robots. With DID a component of Self-Sovereign Identity; robots can authenticate each other independently without relying on any third-party service. Verifiable Credentials enable private data associated with a robot to be stored within the robot's hardware, unlike existing blockchain based approaches where the data has to be on the blockchain. We improve throughput by allowing message exchange over state channels. Being a blockchain backed solution RoboComm provides a trustworthy system without relying on a single party. Moreover, we implement our proposed approach to demonstrate the feasibility of our solution.


Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents

arXiv.org Artificial Intelligence

Generative AI is transforming computing education by enabling the automatic generation of personalized content and feedback. We investigate its capabilities in providing high-quality programming tasks to students. Despite promising advancements in task generation, a quality gap remains between AI-generated and expert-created tasks. The AI-generated tasks may not align with target programming concepts, could be incomprehensible to students, or may contain critical issues such as incorrect tests. Existing works often require interventions from human teachers for validation. We address these challenges by introducing PyTaskSyn, a novel synthesis technique that first generates a programming task and then decides whether it meets certain quality criteria to be given to students. The key idea is to break this process into multiple stages performed by expert and student agents simulated using both strong and weaker generative models. Through extensive evaluation, we show that PyTaskSyn significantly improves task quality compared to baseline techniques and showcases the importance of each specialized agent type in our validation pipeline. Additionally, we conducted user studies using our publicly available web application and show that PyTaskSyn can deliver high-quality programming tasks comparable to expert-designed ones while reducing workload and costs, and being more engaging than programming tasks that are available in online resources.


Discrete-Guided Diffusion for Scalable and Safe Multi-Robot Motion Planning

arXiv.org Artificial Intelligence

Multi-Robot Motion Planning (MRMP) involves generating collision-free trajectories for multiple robots operating in a shared continuous workspace. While discrete multi-agent path finding (MAPF) methods are broadly adopted due to their scalability, their coarse discretization severely limits trajectory quality. In contrast, continuous optimization-based planners offer higher-quality paths but suffer from the curse of dimensionality, resulting in poor scalability with respect to the number of robots. This paper tackles the limitations of these two approaches by introducing a novel framework that integrates discrete MAPF solvers with constrained generative diffusion models. The resulting framework, called Discrete-Guided Diffusion (DGD), has three key characteristics: (1) it decomposes the original nonconvex MRMP problem into tractable subproblems with convex configuration spaces, (2) it combines discrete MAPF solutions with constrained optimization techniques to guide diffusion models capture complex spatiotemporal dependencies among robots, and (3) it incorporates a lightweight constraint repair mechanism to ensure trajectory feasibility. The proposed method sets a new state-of-the-art performance in large-scale, complex environments, scaling to 100 robots while achieving planning efficiency and high success rates.


Symphony: A Decentralized Multi-Agent Framework for Scalable Collective Intelligence

arXiv.org Artificial Intelligence

Most existing Large Language Model (LLM)-based agent frameworks rely on centralized orchestration, incurring high deployment costs, rigid communication topologies, and limited adaptability. To address these challenges, we introduce Symphony, a decentralized multi-agent system which enables lightweight LLMs on consumer-grade GPUs to coordinate. Symphony introduces three key mechanisms: (1) a decentralized ledger that records capabilities, (2) a Beacon-selection protocol for dynamic task allocation, and (3) weighted result voting based on CoTs. This design forms a privacy-saving, scalable, and fault-tolerant orchestration with low overhead. Empirically, Symphony outperforms existing baselines on reasoning benchmarks, achieving substantial accuracy gains and demonstrating robustness across models of varying capacities.


SWIRL: A Staged Workflow for Interleaved Reinforcement Learning in Mobile GUI Control

arXiv.org Artificial Intelligence

The rapid advancement of large vision language models (L VLMs) and agent systems has heightened interest in mobile GUI agents that can reliably translate natural language into interface operations. Existing single-agent approaches, however, remain limited by structural constraints. Although multi-agent systems naturally decouple different competencies, recent progress in multi-agent reinforcement learning (MARL) has often been hindered by inefficiency and remains incompatible with current L VLM architectures. To address these challenges, we introduce SWIRL, a staged workflow for interleaved reinforcement learning designed for multi-agent systems. SWIRL reformulates MARL into a sequence of single-agent reinforcement learning tasks, updating one agent at a time while keeping the others fixed. This formulation enables stable training and promotes efficient coordination across agents. Theoretically, we provide a stepwise safety bound, a cross-round monotonic improvement theorem, and convergence guarantees on return, ensuring robust and principled optimization. In application to mobile GUI control, SWIRL instantiates a Navigator that converts language and screen context into structured plans, and an Interactor that grounds these plans into executable atomic actions. Extensive experiments demonstrate superior performance on both high-level and low-level GUI benchmarks. Beyond GUI tasks, SWIRL also demonstrates strong capability in multi-agent mathematical reasoning, underscoring its potential as a general framework for developing efficient and robust multi-agent systems.


CataractSurg-80K: Knowledge-Driven Benchmarking for Structured Reasoning in Ophthalmic Surgery Planning

arXiv.org Artificial Intelligence

Cataract surgery remains one of the most widely performed and effective procedures for vision restoration. Effective surgical planning requires integrating diverse clinical examinations for patient assessment, intraocular lens (IOL) selection, and risk evaluation. Large language models (LLMs) have shown promise in supporting clinical decision-making. However, existing LLMs often lack the domain-specific expertise to interpret heterogeneous ophthalmic data and provide actionable surgical plans. To enhance the model's ability to interpret heterogeneous ophthalmic reports, we propose a knowledge-driven Multi-Agent System (MAS), where each agent simulates the reasoning process of specialist ophthalmologists, converting raw clinical inputs into structured, actionable summaries in both training and deployment stages. Building on MAS, we introduce CataractSurg-80K, the first large-scale benchmark for cataract surgery planning that incorporates structured clinical reasoning. Each case is annotated with diagnostic questions, expert reasoning chains, and structured surgical recommendations. We further introduce Qwen-CSP, a domain-specialized model built on Qwen-4B, fine-tuned through a multi-stage process tailored for surgical planning. Comprehensive experiments show that Qwen-CSP outperforms strong general-purpose LLMs across multiple metrics. Our work delivers a high-quality dataset, a rigorous benchmark, and a domain-adapted LLM to facilitate future research in medical AI reasoning and decision support.


Flocking Behavior: An Innovative Inspiration for the Optimization of Production Plants

arXiv.org Artificial Intelligence

Optimizing modern production plants using the job-shop principle is a known hard problem. For very large plants, like semiconductor fabs, the problem becomes unsolvable on a plant-wide scale in a reasonable amount of time using classical linear optimization. An alternative approach is the use of swarm intelligence algorithms. These have been applied to the job-shop problem before, but often in a centrally calculated way where they are applied to the solution space, but they can be implemented in a bottom-up fashion to avoid global result computation as well. One of the problems in semiconductor production is that the production process requires a lot of switching between machines that process lots one after the other and machines that process batches of lots at once, often with long processing times. In this paper, we address this switching problem with the ``boids'' flocking algorithm that was originally used in robotics and movie industry. The flocking behavior is a bio-inspired algorithm that uses only local information and interaction based on simple heuristics. We show that this algorithm addresses these valid considerations in production plant optimization, as it reacts to the switching of machine kinds similar to how a swarm of flocking animals would react to obstacles in its course.


Your AI Bosses Are Still Prejudiced: The Emergence of Stereotypes in LLM-Based Multi-Agent Systems

arXiv.org Artificial Intelligence

While stereotypes are well-documented in human social interactions, AI systems are often presumed to be less susceptible to such biases. Previous studies have focused on biases inherited from training data, but whether stereotypes can emerge spontaneously in AI agent interactions merits further exploration. Through a novel experimental framework simulating workplace interactions with neutral initial conditions, we investigate the emergence and evolution of stereotypes in LLM-based multi-agent systems. Our findings reveal that (1) LLM-Based AI agents develop stereotype-driven biases in their interactions despite beginning without predefined biases; (2) stereotype effects intensify with increased interaction rounds and decision-making power, particularly after introducing hierarchical structures; (3) these systems exhibit group effects analogous to human social behavior, including halo effects, confirmation bias, and role congruity; and (4) these stereotype patterns manifest consistently across different LLM architectures. Through comprehensive quantitative analysis, these findings suggest that stereotype formation in AI systems may arise as an emergent property of multi-agent interactions, rather than merely from training data biases. Our work underscores the need for future research to explore the underlying mechanisms of this phenomenon and develop strategies to mitigate its ethical impacts.


PSO-Merging: Merging Models Based on Particle Swarm Optimization

arXiv.org Artificial Intelligence

Model merging has emerged as an efficient strategy for constructing multitask models by integrating the strengths of multiple available expert models, thereby reducing the need to fine-tune a pre-trained model for all the tasks from scratch. Existing data-independent methods struggle with performance limitations due to the lack of data-driven guidance. Data-driven approaches also face key challenges: gradient-based methods are computationally expensive, limiting their practicality for merging large expert models, whereas existing gradient-free methods often fail to achieve satisfactory results within a limited number of optimization steps. To address these limitations, this paper introduces PSO-Merging, a novel data-driven merging method based on the Particle Swarm Optimization (PSO). In this approach, we initialize the particle swarm with a pre-trained model, expert models, and sparsified expert models. We then perform multiple iterations, with the final global best particle serving as the merged model. Experimental results on different language models show that PSO-Merging generally outperforms baseline merging methods, offering a more efficient and scalable solution for model merging.


Efficient Human-Aware Task Allocation for Multi-Robot Systems in Shared Environments

arXiv.org Artificial Intelligence

-- Multi Robot Systems are increasingly deployed in applications, such as intralogistics or autonomous delivery, where multiple robots collaborate to complete tasks efficiently. One of the key factors enabling their efficient cooperation is Multi-Robot T ask Allocation (MRT A). Algorithms solving this problem optimize task distribution among robots to minimize the overall execution time. In shared environments, apart from the relative distance between the robots and the tasks, the execution time is also significantly impacted by the delay caused by navigating around moving people. However, most existing MRT A approaches are dynamics-agnostic, relying on static maps and neglecting human motion patterns, leading to inefficiencies and delays. In this paper, we introduce Human-A ware T ask Allocation (HA T A). This method leverages Maps of Dynamics (MoDs), spatio-temporal queryable models designed to capture historical human movement patterns, to estimate the impact of humans on the task execution time during deployment. HA T A utilizes a stochastic cost function that includes MoDs Experimental results show that integrating MoDs enhances task allocation performance, resulting in reduced mission completion times by up to 26% compared to the dynamics-agnostic method and up to 19% compared to the baseline. This work underscores the importance of considering human dynamics in MRT A within shared environments and presents an efficient framework for deploying multi-robot systems in environments populated by humans.