Planning & Scheduling
Array-Based Monte Carlo Tree Search
Ragan, James, Hadaegh, Fred Y., Chung, Soon-Jo
Monte Carlo Tree Search is a popular method for solving decision making problems. Faster implementations allow for more simulations within the same wall clock time, directly improving search performance. To this end, we present an alternative array-based implementation of the classic Upper Confidence bounds applied to Trees algorithm. Our method preserves the logic of the original algorithm, but eliminates the need for branch prediction, enabling faster performance on pipelined processors, and up to a factor of 2.8 times better scaling with search depth in our numerical simulations.
Reinforcement Learning for Search Tree Size Minimization in Constraint Programming: New Results on Scheduling Benchmarks
Heinz, Vilém, Vilím, Petr, Hanzálek, Zdeněk
Failure-Directed Search (FDS) is a significant complete generic search algorithm used in Constraint Programming (CP) to efficiently explore the search space, proven particularly effective on scheduling problems. This paper analyzes FDS's properties, showing that minimizing the size of its search tree guided by ranked branching decisions is closely related to the Multi-armed bandit (MAB) problem. Building on this insight, MAB reinforcement learning algorithms are applied to FDS, extended with problem-specific refinements and parameter tuning, and evaluated on the two most fundamental scheduling problems, the Job Shop Scheduling Problem (JSSP) and Resource-Constrained Project Scheduling Problem (RCPSP). The resulting enhanced FDS, using the best extended MAB algorithm and configuration, performs 1.7 times faster on the JSSP and 2.1 times faster on the RCPSP benchmarks compared to the original implementation in a new solver called OptalCP, while also being 3.5 times faster on the JSSP and 2.1 times faster on the RCPSP benchmarks than the current state-of-the-art FDS algorithm in IBM CP Optimizer 22.1. Furthermore, using only a 900-second time limit per instance, the enhanced FDS improved the existing state-of-the-art lower bounds of 78 of 84 JSSP and 226 of 393 RCPSP standard open benchmark instances while also completely closing a few of them.
APT*: Asymptotically Optimal Motion Planning via Adaptively Prolated Elliptical R-Nearest Neighbors
Zhang, Liding, Wang, Sicheng, Cai, Kuanqi, Bing, Zhenshan, Wu, Fan, Wang, Chaoqun, Haddadin, Sami, Knoll, Alois
Optimal path planning aims to determine a sequence of states from a start to a goal while accounting for planning objectives. Popular methods often integrate fixed batch sizes and neglect information on obstacles, which is not problem-specific. This study introduces Adaptively Prolated Trees (APT*), a novel sampling-based motion planner that extends based on Force Direction Informed Trees (FDIT*), integrating adaptive batch-sizing and elliptical $r$-nearest neighbor modules to dynamically modulate the path searching process based on environmental feedback. APT* adjusts batch sizes based on the hypervolume of the informed sets and considers vertices as electric charges that obey Coulomb's law to define virtual forces via neighbor samples, thereby refining the prolate nearest neighbor selection. These modules employ non-linear prolate methods to adaptively adjust the electric charges of vertices for force definition, thereby improving the convergence rate with lower solution costs. Comparative analyses show that APT* outperforms existing single-query sampling-based planners in dimensions from $\mathbb{R}^4$ to $\mathbb{R}^{16}$, and it was further validated through a real-world robot manipulation task. A video showcasing our experimental results is available at: https://youtu.be/gCcUr8LiEw4
Efficient Human-Aware Task Allocation for Multi-Robot Systems in Shared Environments
Eskeri, Maryam Kazemi, Kyrki, Ville, Baumann, Dominik, Kucner, Tomasz Piotr
-- Multi Robot Systems are increasingly deployed in applications, such as intralogistics or autonomous delivery, where multiple robots collaborate to complete tasks efficiently. One of the key factors enabling their efficient cooperation is Multi-Robot T ask Allocation (MRT A). Algorithms solving this problem optimize task distribution among robots to minimize the overall execution time. In shared environments, apart from the relative distance between the robots and the tasks, the execution time is also significantly impacted by the delay caused by navigating around moving people. However, most existing MRT A approaches are dynamics-agnostic, relying on static maps and neglecting human motion patterns, leading to inefficiencies and delays. In this paper, we introduce Human-A ware T ask Allocation (HA T A). This method leverages Maps of Dynamics (MoDs), spatio-temporal queryable models designed to capture historical human movement patterns, to estimate the impact of humans on the task execution time during deployment. HA T A utilizes a stochastic cost function that includes MoDs Experimental results show that integrating MoDs enhances task allocation performance, resulting in reduced mission completion times by up to 26% compared to the dynamics-agnostic method and up to 19% compared to the baseline. This work underscores the importance of considering human dynamics in MRT A within shared environments and presents an efficient framework for deploying multi-robot systems in environments populated by humans.
A Lightweight Crowd Model for Robot Social Navigation
Eskeri, Maryam Kazemi, Wiedemann, Thomas, Kyrki, Ville, Baumann, Dominik, Kucner, Tomasz Piotr
Robots operating in human-populated environments must navigate safely and efficiently while minimizing social disruption. Achieving this requires estimating crowd movement to avoid congested areas in real-time. Traditional microscopic models struggle to scale in dense crowds due to high computational cost, while existing macroscopic crowd prediction models tend to be either overly simplistic or computationally intensive. In this work, we propose a lightweight, real-time macroscopic crowd prediction model tailored for human motion, which balances prediction accuracy and computational efficiency. Our approach simplifies both spatial and temporal processing based on the inherent characteristics of pedestrian flow, enabling robust generalization without the overhead of complex architectures. We demonstrate a 3.6 times reduction in inference time, while improving prediction accuracy by 3.1 %. Integrated into a socially aware planning framework, the model enables efficient and socially compliant robot navigation in dynamic environments. This work highlights that efficient human crowd modeling enables robots to navigate dense environments without costly computations.
Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing
Puthumanaillam, Gokul, Penumarti, Aditya, Vora, Manav, Padrao, Paulo, Fuentes, Jose, Bobadilla, Leonardo, Shin, Jane, Ornik, Melkior
Robots equipped with rich sensor suites can localize reliably in partially-observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically--a brittle, runtime-expensive approach. Data-driven approaches--including diffusion models--learn multi-modal trajectories from demonstrations, but presuppose an accurate, always-on state estimate. We address the largely open problem: for a given task in a mapped environment, which \textit{minimal sensor subset} must be active at each location to maintain state uncertainty \textit{just low enough} to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error--eliminating external covariance rollouts. We show that this single proxy suffices for a soft-actor-critic to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.
Direction Informed Trees (DIT*): Optimal Path Planning via Direction Filter and Direction Cost Heuristic
Zhang, Liding, Chen, Kejia, Cai, Kuanqi, Zhang, Yu, Dang, Yixuan, Wu, Yansong, Bing, Zhenshan, Wu, Fan, Haddadin, Sami, Knoll, Alois
Optimal path planning requires finding a series of feasible states from the starting point to the goal to optimize objectives. Popular path planning algorithms, such as Effort Informed Trees (EIT*), employ effort heuristics to guide the search. Effective heuristics are accurate and computationally efficient, but achieving both can be challenging due to their conflicting nature. This paper proposes Direction Informed Trees (DIT*), a sampling-based planner that focuses on optimizing the search direction for each edge, resulting in goal bias during exploration. We define edges as generalized vectors and integrate similarity indexes to establish a directional filter that selects the nearest neighbors and estimates direction costs. The estimated direction cost heuristics are utilized in edge evaluation. This strategy allows the exploration to share directional information efficiently. DIT* convergence faster than existing single-query, sampling-based planners on tested problems in R^4 to R^16 and has been demonstrated in real-world environments with various planning tasks. A video showcasing our experimental results is available at: https://youtu.be/2SX6QT2NOek
Uncertainty-Resilient Active Intention Recognition for Robotic Assistants
Saborío, Juan Carlos, Vinci, Marc, Lima, Oscar, Stock, Sebastian, Niecksch, Lennart, Günther, Martin, Sung, Alexander, Hertzberg, Joachim, Atzmüller, Martin
-- Purposeful behavior in robotic assistants requires the integration of multiple components and technological advances. Often, the problem is reduced to recognizing explicit prompts, which limits autonomy, or is oversimplified through assumptions such as near-perfect information. We argue that a critical gap remains unaddressed - specifically, the challenge of reasoning about the uncertain outcomes and perception errors inherent to human intention recognition. In response, we present a framework designed to be resilient to uncertainty and sensor noise, integrating real-time sensor data with a combination of planners. Our integrated framework has been successfully tested on a physical robot with promising results. Robotic assistants may be integrated into modern industrial environments, e.g., delivering tools, parts or modules interleaved with tidying the workspace. Such tasks, however, require a combination of robust planning, navigation, grasping, and perception-particularly when explicit commands are not available and the robot must identify and pursue goals, in collaborative spaces shared with people.
Enhanced UAV Path Planning Using the Tangent Intersection Guidance (TIG) Algorithm
Cheriet, Hichem, Badra, Khellat Kihel, Samira, Chouraqui
Efficient and safe navigation of Unmanned Aerial Vehicles (UAVs) is critical for various applications, including combat support, package delivery and Search and Rescue Operations. This paper introduces the Tangent Intersection Guidance (TIG) algorithm, an advanced approach for UAV path planning in both static and dynamic environments. The algorithm uses the elliptic tangent intersection method to generate feasible paths. It generates two sub-paths for each threat, selects the optimal route based on a heuristic rule, and iteratively refines the path until the target is reached. Considering the UAV kinematic and dynamic constraints, a modified smoothing technique based on quadratic Bézier curves is adopted to generate a smooth and efficient route. Experimental results show that the TIG algorithm can generate the shortest path in less time, starting from 0.01 seconds, with fewer turning angles compared to A*, PRM, RRT*, Tangent Graph, and Static APPATT algorithms in static environments. Furthermore, in completely unknown and partially known environments, TIG demonstrates efficient real-time path planning capabilities for collision avoidance, outperforming APF and Dynamic APPATT algorithms.
Weisfeiler-Leman Features for Planning: A 1,000,000 Sample Size Hyperparameter Study
Weisfeiler-Leman Features (WLFs) are a recently introduced classical machine learning tool for learning to plan and search. They have been shown to be both theoretically and empirically superior to existing deep learning approaches for learning value functions for search in symbolic planning. In this paper, we introduce new WLF hyperparameters and study their various tradeoffs and effects. We utilise the efficiency of WLFs and run planning experiments on single core CPUs with a sample size of 1,000,000 to understand the effect of hyperparameters on training and planning. Our experimental analysis show that there is a robust and best set of hyperparameters for WLFs across the tested planning domains. We find that the best WLF hyperparameters for learning heuristic functions minimise execution time rather than maximise model expressivity. We further statistically analyse and observe no significant correlation between training and planning metrics.