Goto

Collaborating Authors

 Planning & Scheduling


Probabilistic Bubble Roadmap

arXiv.org Artificial Intelligence

SE Department of Information T echnology, Uppsala University Abstract Finding a collision-free path is a fundamental problem in robotics, where the sampling based planners have a long line of success. However, this approach is computationally expensive, due to the frequent use of collision-detection. Furthermore, the produced paths are usually jagged and require further post-processing before they can be tracked. Due to their high computational cost, these planners are usually restricted to static settings, since they are not able to cope with rapid changes in the environment. In our work, we remove this restriction by introducing a learned signed distance function expressed in the configuration space of the robot. The signed distance allows us to form collision-free spherical regions in the configuration space, which we use to suggest a new multi-query path planner that also works in dynamic settings. We propose the probabilistic bubble roadmap planner, which enhances the probabilistic roadmap planner (PRM) by using spheres as vertices and compute the edges by checking for neighboring spheres which intersect. We benchmark our approach in a static setting where we show that we can produce paths that are shorter than the paths produced by the PRM, while having a smaller sized roadmap and finding the paths faster. Finally, we show that we can rapidly rewire the graph in the case of new obstacles introduced at run time and therefore produce paths in the case of moving obstacles. Keywords: Motion planning, Signed distance function, Manipulators, Robotics 1. Introduction Motion planning is the problem of finding a collision-free trajectory that connects a given start and goal configuration. The planning takes place in the configuration space of the robot.


PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving

arXiv.org Artificial Intelligence

Recent agent frameworks and inference-time algorithms often struggle with complex planning problems due to limitations in verifying generated plans or reasoning and varying complexity of instances within a single task. Many existing methods for these tasks either perform task-level verification without considering constraints or apply inference-time algorithms without adapting to instance-level complexity. To address these limitations, we propose PlanGEN, a model-agnostic and easily scalable agent framework with three key components: constraint, verification, and selection agents. Specifically, our approach proposes constraint-guided iterative verification to enhance performance of inference-time algorithms--Best of N, Tree-of-Thought, and REBASE. In PlanGEN framework, the selection agent optimizes algorithm choice based on instance complexity, ensuring better adaptability to complex planning problems. Experimental results demonstrate significant improvements over the strongest baseline across multiple benchmarks, achieving state-of-the-art results on NATURAL PLAN ($\sim$8%$\uparrow$), OlympiadBench ($\sim$4%$\uparrow$), DocFinQA ($\sim$7%$\uparrow$), and GPQA ($\sim$1%$\uparrow$). Our key finding highlights that constraint-guided iterative verification improves inference-time algorithms, and adaptive selection further boosts performance on complex planning and reasoning problems.


Towards Autonomous Navigation of Neuroendovascular Tools for Timely Stroke Treatment via Contact-aware Path Planning

arXiv.org Artificial Intelligence

In this paper, we propose a model-based contact-aware motion planner for autonomous navigation of neuroendovascular tools in acute ischemic stroke. The planner is designed to find the optimal control strategy for telescopic pre-bent catheterization tools such as guidewire and catheters, currently used for neuroendovascular procedures. A kinematic model for the telescoping tools and their interaction with the surrounding anatomy is derived to predict tools steering. By leveraging geometrical knowledge of the anatomy, obtained from pre-operative segmented 3D images, and the mechanics of the telescoping tools, the planner finds paths to the target enabled by interacting with the surroundings. We propose an actuation platform for insertion and rotation of the telescopic tools and present experimental results for the navigation from the base of the descending aorta to the LCCA. We demonstrate that, by leveraging the pre-operative plan, we can consistently navigate the LCCA with 100% success of over 50 independent trials. We also study the robustness of the planner towards motion of the aorta and errors in the initial positioning of the robotic tools. The proposed plan can successfully reach the LCCA for rotations of the aorta of up to 10{\deg}, and displacement of up to 10mm, on the coronal plane.


IA-TIGRIS: An Incremental and Adaptive Sampling-Based Planner for Online Informative Path Planning

arXiv.org Artificial Intelligence

Planning paths that maximize information gain for robotic platforms has wide-ranging applications and significant potential impact. To effectively adapt to real-time data collection, informative path planning must be computed online and be responsive to new observations. In this work, we present IA-TIGRIS, an incremental and adaptive sampling-based informative path planner that can be run efficiently with onboard computation. Our approach leverages past planning efforts through incremental refinement while continuously adapting to updated world beliefs. We additionally present detailed implementation and optimization insights to facilitate real-world deployment, along with an array of reward functions tailored to specific missions and behaviors. Extensive simulation results demonstrate IA-TIGRIS generates higher-quality paths compared to baseline methods. We validate our planner on two distinct hardware platforms: a hexarotor UAV and a fixed-wing UAV, each having unique motion models and configuration spaces. Our results show up to a 41% improvement in information gain compared to baseline methods, suggesting significant potential for deployment in real-world applications.


Exploring Embodied Multimodal Large Models: Development, Datasets, and Future Directions

arXiv.org Artificial Intelligence

Embodied multimodal large models (EMLMs) have gained significant attention in recent years due to their potential to bridge the gap between perception, cognition, and action in complex, real-world environments. This comprehensive review explores the development of such models, including Large Language Models (LLMs), Large Vision Models (LVMs), and other models, while also examining other emerging architectures. We discuss the evolution of EMLMs, with a focus on embodied perception, navigation, interaction, and simulation. Furthermore, the review provides a detailed analysis of the datasets used for training and evaluating these models, highlighting the importance of diverse, high-quality data for effective learning. The paper also identifies key challenges faced by EMLMs, including issues of scalability, generalization, and real-time decision-making. Finally, we outline future directions, emphasizing the integration of multimodal sensing, reasoning, and action to advance the development of increasingly autonomous systems. By providing an in-depth analysis of state-of-the-art methods and identifying critical gaps, this paper aims to inspire future advancements in EMLMs and their applications across diverse domains.


VB-Com: Learning Vision-Blind Composite Humanoid Locomotion Against Deficient Perception

arXiv.org Artificial Intelligence

The performance of legged locomotion is closely tied to the accuracy and comprehensiveness of state observations. Blind policies, which rely solely on proprioception, are considered highly robust due to the reliability of proprioceptive observations. However, these policies significantly limit locomotion speed and often require collisions with the terrain to adapt. In contrast, Vision policies allows the robot to plan motions in advance and respond proactively to unstructured terrains with an online perception module. However, perception is often compromised by noisy real-world environments, potential sensor failures, and the limitations of current simulations in presenting dynamic or deformable terrains. Humanoid robots, with high degrees of freedom and inherently unstable morphology, are particularly susceptible to misguidance from deficient perception, which can result in falls or termination on challenging dynamic terrains. To leverage the advantages of both vision and blind policies, we propose VB-Com, a composite framework that enables humanoid robots to determine when to rely on the vision policy and when to switch to the blind policy under perceptual deficiency. We demonstrate that VB-Com effectively enables humanoid robots to traverse challenging terrains and obstacles despite perception deficiencies caused by dynamic terrains or perceptual noise.


Planning, scheduling, and execution on the Moon: the CADRE technology demonstration mission

arXiv.org Artificial Intelligence

NASA's Cooperative Autonomous Distributed Robotic Exploration (CADRE) mission, slated for flight to the Moon's Reiner Gamma region in 2025/2026, is designed to demonstrate multi-agent autonomous exploration of the Lunar surface and sub-surface. A team of three robots and a base station will autonomously explore a region near the lander, collecting the data required for 3D reconstruction of the surface with no human input; and then autonomously perform distributed sensing with multi-static ground penetrating radars (GPR), driving in formation while performing coordinated radar soundings to create a map of the subsurface. At the core of CADRE's software architecture is a novel autonomous, distributed planning, scheduling, and execution (PS&E) system. The system coordinates the robots' activities, planning and executing tasks that require multiple robots' participation while ensuring that each individual robot's thermal and power resources stay within prescribed bounds, and respecting ground-prescribed sleep-wake cycles. The system uses a centralized-planning, distributed-execution paradigm, and a leader election mechanism ensures robustness to failures of individual agents. In this paper, we describe the architecture of CADRE's PS&E system; discuss its design rationale; and report on verification and validation (V&V) testing of the system on CADRE's hardware in preparation for deployment on the Moon.


Motion planning for highly-dynamic unconditioned reflexes based on chained Signed Distance Functions

arXiv.org Artificial Intelligence

The unconditioned reflex (e.g., protective reflex), which is the innate reaction of the organism and usually performed through the spinal cord rather than the brain, can enable organisms to escape harms from environments. In this paper, we propose an online, highly-dynamic motion planning algorithm to endow manipulators the highly-dynamic unconditioned reflexes to humans and/or environments. Our method is based on a chained version of Signed Distance Functions (SDFs), which can be pre-computed and stored. Our proposed algorithm is divided into two stages. In the offline stage, we create 3 groups of local SDFs to store the geometric information of the manipulator and its working environment. In the online stage, the pre-computed local SDFs are chained together according the configuration of the manipulator, to provide global geometric information about the environment. While the point clouds of the dynamic objects serve as query points to look up these local SDFs for quickly generating escape velocity. Then we propose a modified geometric Jacobian matrix and use the Jacobian-pseudo-inverse method to generate real-time reflex behaviors to avoid the static and dynamic obstacles in the environment. The benefits of our method are validated in both static and dynamic scenarios. In the static scenario, our method identifies the path solutions with lower time consumption and shorter trajectory length compared to existing solutions. In the dynamic scenario, our method can reliably pursue the dynamic target point, avoid dynamic obstacles, and react to these obstacles within 1ms, which surpasses the unconditioned reflex reaction time of humans.


NTP-INT: Network Traffic Prediction-Driven In-band Network Telemetry for High-load Switches

arXiv.org Artificial Intelligence

In-band network telemetry (INT) is essential to network management due to its real-time visibility. However, because of the rapid increase in network devices and services, it has become crucial to have targeted access to detailed network information in a dynamic network environment. This paper proposes an intelligent network telemetry system called NTP-INT to obtain more fine-grained network information on high-load switches. Specifically, NTP-INT consists of three modules: network traffic prediction module, network pruning module, and probe path planning module. Firstly, the network traffic prediction module adopts a Multi-Temporal Graph Neural Network (MTGNN) to predict future network traffic and identify high-load switches. Then, we design the network pruning algorithm to generate a subnetwork covering all high-load switches to reduce the complexity of probe path planning. Finally, the probe path planning module uses an attention-mechanism-based deep reinforcement learning (DEL) model to plan efficient probe paths in the network slice. The experimental results demonstrate that NTP-INT can acquire more precise network information on high-load switches while decreasing the control overhead by 50\%.


Path Planning for Masked Diffusion Model Sampling

arXiv.org Artificial Intelligence

In this paper, we explore how token unmasking order influences generative quality in masked diffusion models (MDMs). We derive an expanded evidence lower bound (ELBO) that introduces a planner to select which tokens to unmask at each step. Our analysis reveals that alternative unmasking strategies can enhance generation performance. Building on this, we propose Path Planning (P2), a sampling framework that uses a pre-trained BERT model or the denoiser itself to guide unmasking decisions. P2 generalizes all known MDM sampling strategies and significantly improves performance across diverse domains, including language generation (in-context learning, code generation, story infilling, mathematical reasoning, reverse curse correction) and biological sequence generation (protein and RNA sequences).