Planning & Scheduling
Enhancing Floor Plan Recognition: A Hybrid Mix-Transformer and U-Net Approach for Precise Wall Segmentation
Parashchuk, Dmitriy, Kapshitskiy, Alexey, Karyakin, Yuriy
Automatic 3D reconstruction of indoor spaces from 2D floor plans necessitates high-precision semantic segmentation of structural elements, particularly walls. However, existing methods often struggle with detecting thin structures and maintaining geometric precision. This study introduces MitUNet, a hybrid neural network combining a Mix-Transformer encoder and a U-Net decoder enhanced with spatial and channel attention blocks. Our approach, optimized with the Tversky loss function, achieves a balance between precision and recall, ensuring accurate boundary recovery. Experiments on the CubiCasa5k dataset and a proprietary regional dataset demonstrate MitUNet's superiority in generating structurally correct masks with high boundary accuracy, outperforming standard models. This tool provides a robust foundation for automated 3D reconstruction pipelines. To ensure reproducibility and facilitate future research, the source code and the proprietary regional dataset are publicly available at https://github.com/aliasstudio/mitunet and https://doi.org/10.5281/zenodo.17871079 respectively.
C*: A Coverage Path Planning Algorithm for Unknown Environments using Rapidly Covering Graphs
Shen, Zongyuan, Wilson, James P., Gupta, Shalabh
The paper presents a novel sample-based algorithm, called C*, for real-time coverage path planning (CPP) of unknown environments. C* is built upon the concept of a Rapidly Covering Graph (RCG), which is incrementally constructed during robot navigation via progressive sampling of the search space. By using efficient sampling and pruning techniques, the RCG is constructed to be a minimum-sufficient graph, where its nodes and edges form the potential waypoints and segments of the coverage trajectory, respectively. The RCG tracks the coverage progress, generates the coverage trajectory and helps the robot to escape from the dead-end situations. To minimize coverage time, C* produces the desired back-and-forth coverage pattern, while adapting to the TSP-based optimal coverage of local isolated regions, called coverage holes, which are surrounded by obstacles and covered regions. It is analytically proven that C* provides complete coverage of unknown environments. The algorithmic simplicity and low computational complexity of C* make it easy to implement and suitable for real-time on-board applications. The performance of C* is validated by 1) extensive high-fidelity simulations and 2) laboratory experiments using an autonomous robot. C* yields near optimal trajectories, and a comparative evaluation with seven existing CPP methods demonstrates significant improvements in performance in terms of coverage time, number of turns, trajectory length, and overlap ratio, while preventing the formation of coverage holes. Finally, C* is comparatively evaluated on two different CPP applications using 1) energy-constrained robots and 2) multi-robot teams.
Principles2Plan: LLM-Guided System for Operationalising Ethical Principles into Plans
Zhong, Tammy, Song, Yang, Pagnucco, Maurice
Ethical awareness is critical for robots operating in human environments, yet existing automated planning tools provide little support. Manually specifying ethical rules is labour-intensive and highly context-specific. We present Princi-ples2Plan, an interactive research prototype demonstrating how a human and a Large Language Model (LLM) can collaborate to produce context-sensitive ethical rules and guide automated planning. A domain expert provides the planning domain, problem details, and relevant high-level principles such as beneficence and privacy. The system generates op-erationalisable ethical rules consistent with these principles, which the user can review, prioritise, and supply to a planner to produce ethically-informed plans. To our knowledge, no prior system supports users in generating principle-grounded rules for classical planning contexts. Principles2Plan showcases the potential of human-LLM collaboration for making ethical automated planning more practical and feasible.
Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation
Wei, Meng, Wan, Chenyang, Peng, Jiaqi, Yu, Xiqian, Yang, Yuqiang, Feng, Delin, Cai, Wenzhe, Zhu, Chenming, Wang, Tai, Pang, Jiangmiao, Liu, Xihui
While recent large vision-language models (VLMs) have improved generalization in vision-language navigation (VLN), existing methods typically rely on end-to-end pipelines that map vision-language inputs directly to short-horizon discrete actions. Such designs often produce fragmented motions, incur high latency, and struggle with real-world challenges like dynamic obstacle avoidance. We propose DualVLN, the first dual-system VLN foundation model that synergistically integrates high-level reasoning with low-level action execution. System 2, a VLM-based global planner, "grounds slowly" by predicting mid-term waypoint goals via image-grounded reasoning. System 1, a lightweight, multi-modal conditioning Diffusion Transformer policy, "moves fast" by leveraging both explicit pixel goals and latent features from System 2 to generate smooth and accurate trajectories. The dual-system design enables robust real-time control and adaptive local decision-making in complex, dynamic environments. By decoupling training, the VLM retains its generalization, while System 1 achieves interpretable and effective local navigation. DualVLN outperforms prior methods across all VLN benchmarks and real-world experiments demonstrate robust long-horizon planning and real-time adaptability in dynamic environments.
OptMap: Geometric Map Distillation via Submodular Maximization
Thorne, David, Chan, Nathan, Robison, Christa S., Osteen, Philip R., Lopez, Brett T.
Abstract--Autonomous robots rely on geometric maps to inform a diverse set of perception and decision-making algorithms. As autonomy requires reasoning and planning on multiple scales of the environment, each algorithm may require a different map for optimal performance. Light Detection And Ranging (LiDAR) sensors generate an abundance of geometric data to satisfy these diverse requirements, but selecting informative, size-constrained maps is computationally challenging as it requires solving an NP-hard combinatorial optimization. In this work we present OptMap: a geometric map distillation algorithm which achieves real-time, application-specific map generation via multiple theoretical and algorithmic innovations. A central feature is the maximization of set functions that exhibit diminishing returns, i.e., submodularity, using polynomial-time algorithms with provably near-optimal solutions. We formulate a novel submodular reward function which quantifies informativeness, reduces input set sizes, and minimizes bias in sequentially collected datasets. Further, we propose a dynamically reordered streaming submod-ular algorithm which improves empirical solution quality and addresses input order bias via an online approximation of the value of all scans. T esting was conducted on open-source and custom datasets with an emphasis on long-duration mapping sessions, highlighting OptMap's minimal computation requirements. Open-source ROS1 and ROS2 packages are available and can be used alongside any LiDAR SLAM algorithm. ODERN autonomous systems use a modular software architecture with separate algorithms for perceiving the environment, planning collision-free paths, estimating vehicle motion, and making higher-level decisions to complete their tasks. Many of these algorithms depend on geometric information about the environment to function properly. As a result, their performance and processing time can vary greatly depending on the quality of the geometric data. For example, trajectory planners use geometric maps to plan collision-free paths, but the density of geometric data is critical for balancing real-time replanning requirements against reliable collision detection. This trade-off is best served by dense geometric maps that specifically capture the intended trajectory corridor (Figure 1a). In contrast, localization entails aligning a source and reference point cloud, a process best served by using a sparse and global reference point could to minimize computation time while maximizing alignment accuracy (Figure 1b). Distribution Statement A: Approved for public release; distribution is unlimited. Map is dense while remaining efficient as only points near the intended trajectory are returned.
Efficient Computation of a Continuous Topological Model of the Configuration Space of Tethered Mobile Robots
Battocletti, Gianpietro, Boskos, Dimitris, De Schutter, Bart
Despite the attention that the problem of path planning for tethered robots has garnered in the past few decades, the approaches proposed to solve it typically rely on a discrete representation of the configuration space and do not exploit a model that can simultaneously capture the topological information of the tether and the continuous location of the robot. In this work, we explicitly build a topological model of the configuration space of a tethered robot starting from a polygonal representation of the workspace where the robot moves. To do so, we first establish a link between the configuration space of the tethered robot and the universal covering space of the workspace, and then we exploit this link to develop an algorithm to compute a simplicial complex model of the configuration space. We show how this approach improves the performances of existing algorithms that build other types of representations of the configuration space. The proposed model can be computed in a fraction of the time required to build traditional homotopy-augmented graphs, and is continuous, allowing to solve the path planning task for tethered robots using a broad set of path planning algorithms.
Implementing Cumulative Functions with Generalized Cumulative Constraints
Schaus, Pierre, Thomas, Charles, Kameugne, Roger
Modeling scheduling problems with conditional time intervals and cumulative functions has become a common approach when using modern commercial constraint programming solvers. This paradigm enables the modeling of a wide range of scheduling problems, including those involving producers and consumers. However, it is unavailable in existing open-source solvers and practical implementation details remain undocumented. In this work, we present an implementation of this modeling approach using a single, generic global constraint called the Generalized Cumulative. We also introduce a novel time-table filtering algorithm specifically designed to handle tasks defined on conditional time-intervals. Experimental results demonstrate that this approach, combined with the new filtering algorithm, performs competitively with existing solvers enabling the modeling of producer and consumer scheduling problems and effectively scales to large-scale problems.
Exploring Adversarial Obstacle Attacks in Search-based Path Planning for Autonomous Mobile Robots
Szvoren, Adrian, Liu, Jianwei, Kanoulas, Dimitrios, Tuptuk, Nilufer
Abstract-- Path planning algorithms, such as the search-based A*, are a critical component of autonomous mobile robotics, enabling robots to navigate from a starting point to a destination efficiently and safely. We investigated the resilience of the A* algorithm in the face of potential adversarial interventions known as obstacle attacks. The adversary's goal is to delay the robot's timely arrival at its destination by introducing obstacles along its original path. We developed malicious software to execute the attacks and conducted experiments to assess their impact, both in simulation using T urtleBot in Gazebo and in real-world deployment with the Unitree Go1 robot. In simulation, the attacks resulted in an average delay of 36%, with the most significant delays occurring in scenarios where the robot was forced to take substantially longer alternative paths. In real-world experiments, the delays were even more pronounced, with all attacks successfully rerouting the robot and causing measurable disruptions. These results highlight that the algorithm's robustness is not solely an attribute of its design but is significantly influenced by the operational environment. For example, in constrained environments like tunnels, the delays were maximized due to the limited availability of alternative routes.
Real-Time Spatiotemporal Tubes for Dynamic Unsafe Sets
Das, Ratnangshu, Upadhyay, Siddhartha, Jagtap, Pushpak
This paper presents a real-time control framework for nonlinear pure-feedback systems with unknown dynamics to satisfy reach-avoid-stay tasks within a prescribed time in dynamic environments. To achieve this, we introduce a real-time spatiotemporal tube (STT) framework. An STT is defined as a time-varying ball in the state space whose center and radius adapt online using only real-time sensory input. A closed-form, approximation-free control law is then derived to constrain the system output within the STT, ensuring safety and task satisfaction. We provide formal guarantees for obstacle avoidance and on-time task completion. The effectiveness and scalability of the framework are demonstrated through simulations and hardware experiments on a mobile robot and an aerial vehicle, navigating in cluttered dynamic environments.
POrTAL: Plan-Orchestrated Tree Assembly for Lookahead
Conway, Evan, Porfirio, David, Chan, David, Roberts, Mark, Hiatt, Laura M.
Abstract-- Assigning tasks to robots often involves supplying the robot with an overarching goal, such as through natural language, and then relying on the robot to uncover and execute a plan to achieve that goal. In many settings common to human-robot interaction, however, the world is only partially observable to the robot, requiring that it create plans under uncertainty. Although many probabilistic planning algorithms exist for this purpose, these algorithms can be inefficient if executed with the robot's limited computational resources, or may require more steps than expected to achieve the goal. We thereby created a new, lightweight, probabilistic planning algorithm, Plan-Orchestrated Tree Assembly for Lookahead (POrTAL), that combines the strengths of two baseline planning algorithms, FF-Replan and POMCP . In a series of case studies, we demonstrate POrTAL's ability to quickly arrive at solutions that outperform these baselines in terms of number of steps. We additionally demonstrate how POrTAL performs under varying temporal constraints. The ability of modern robots to respond to arbitrary user requests has advanced considerably in recent years. This advancement is in large part due to robots' ability to autonomously plan their own actions. When receiving a goal such as "bring me a cup of coffee," for example, a robot can calculate the minimum number of steps required to achieve this goal: obtain the coffee grinds, proceeding to the coffee maker, load the grinds, and so on. In many scenarios common to human-robot interaction, however, this planning must be performed under considerable uncertainty.