exploration path
HEADER: Hierarchical Robot Exploration via Attention-Based Deep Reinforcement Learning with Expert-Guided Reward
Cao, Yuhong, Wang, Yizhuo, Liang, Jingsong, Liao, Shuhao, Zhang, Yifeng, Li, Peizhuo, Sartoretti, Guillaume
Abstract: This work pushes the boundaries of learning-based methods in autonomous robot exploration in terms of environmental scale and exploration efficiency. HEADER follows existing conventional methods to construct hierarchical representations for the robot belief/map, but further designs a novel community-based algorithm to construct and update a global graph, which remains fully incremental, shape-adaptive, and operates with linear complexity. Building upon attention-based networks, our planner finely reasons about the nearby belief within the local range while coarsely leveraging distant information at the global scale, enabling next-best-viewpoint decisions that consider multi-scale spatial dependencies. Beyond the novel map representation, we introduce a parameter-free privileged reward that significantly improves model performance and produces near-optimal exploration behaviors by avoiding the training objective bias caused by handcrafted reward shaping. In challenging, large-scale simulated exploration scenarios, HEADER demonstrates better scalability than most existing learning and non-learning methods, while achieving a significant improvement in exploration efficiency (up to 20%) over state-of-the-art baselines.
In autonomous exploration, a mobile robot is tasked with exploring and mapping an unknown environment as fast as possible. By planning and executing its exploration path, the robot classifies unknown areas into free or obstacle areas based on its accumulated sensor measurements. In this work, we focus on tasks where a ground robot is equipped with an omnidirectional 3D LiDAR to obtain long-range, low-noise, and dense point cloud measurements. Recent advancements in LiDAR odometry have enabled accurate and robust localization and mapping in large-scale environments [1]-[3], allowing recent planners to focus on exploring the environment without concerns about mapping/localization accuracy [4]-[9].
Despite this, few planners support exploration at large scale in real-world environments [5], [10], mainly due to the complexity that comes with long-term, real-time path planning requirements. That is, to achieve efficient exploration, the planner must actively react to belief and map updates at a high frequency by (re-)reasoning about the full partial belief, to replan a long-term, non-myopic exploration path.
The authors are with the Department of Mechanical Engineering, College of Design and Engineering, National University of Singapore.
Figure: Example hierarchical graph constructed by HEADER during its autonomous exploration of our campus.
AI-Guided Exploration of Large-Scale Codebases
Understanding large-scale, complex software systems is a major challenge for developers, who spend a significant portion of their time on program comprehension. Traditional tools such as static visualizations and reverse engineering techniques provide structural insights but often lack interactivity, adaptability, and integration with contextual information. Recent advancements in large language models (LLMs) offer new opportunities to enhance code exploration workflows, yet their lack of grounding and integration with structured views limits their effectiveness. This work introduces a hybrid approach that integrates deterministic reverse engineering with LLM-guided, intent-aware visual exploration. The proposed system combines UML-based visualization, dynamic user interfaces, historical context, and collaborative features into an adaptive tool for code comprehension. By interpreting user queries and interaction patterns, the LLM helps developers navigate and understand complex codebases more effectively. A prototype implementation for Java demonstrates the feasibility of this approach. Future work includes empirical evaluation, scaling to polyglot systems, and exploring GUI-driven LLM interaction models. This research lays the groundwork for intelligent, interactive environments that align with developer cognition and collaborative workflows.
Action-Aware Pro-Active Safe Exploration for Mobile Robot Mapping
İşleyen, Aykut, van de Molengraft, René, Arslan, Ömür
Safe autonomous exploration of unknown environments is an essential skill for mobile robots to effectively and adaptively perform environmental mapping for diverse critical tasks. Due to its simplicity, most existing exploration methods rely on the standard frontier-based exploration strategy, which directs a robot to the boundary between the known safe and the unknown unexplored spaces to acquire new information about the environment. This typically follows a recurrent persistent planning strategy, first selecting an informative frontier viewpoint, then moving the robot toward the selected viewpoint until reaching it, and repeating these steps until termination. However, exploration with persistent planning may lack adaptivity to continuously updated maps, whereas highly adaptive exploration with online planning often suffers from high computational costs and potential issues with livelocks. In this paper, as an alternative to less-adaptive persistent planning and costly online planning, we introduce a new proactive preventive replanning strategy for effective exploration using the immediately available actionable information at a viewpoint to avoid redundant, uninformative last-mile exploration motion. We also use the actionable information of a viewpoint as a systematic termination criterion for exploration. To close the gap between perception and action, we perform safe and informative path planning that minimizes the risk of collision with detected obstacles and the distance to unexplored regions, and we apply action-aware viewpoint selection with maximal information utility per total navigation cost. We demonstrate the effectiveness of our action-aware proactive exploration method in numerical simulations and hardware experiments.
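The selection rule described above (maximal information utility per total navigation cost, with the same quantity doubling as a termination criterion) can be illustrated with a minimal Python sketch. The abstract does not define how information utility or navigation cost are computed, so the `Viewpoint` fields, the `min_gain` threshold, and the function name below are hypothetical placeholders, not the authors' formulation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Viewpoint:
    name: str
    info_gain: float  # assumed measure, e.g. unknown cells expected to be observed
    nav_cost: float   # assumed measure, e.g. path length needed to reach the viewpoint


def select_viewpoint(candidates: Iterable[Viewpoint],
                     min_gain: float = 0.0) -> Optional[Viewpoint]:
    """Return the candidate with the highest gain-per-cost ratio.

    Returns None when no candidate offers more than `min_gain`, which a
    caller can treat as the signal to stop exploring.
    """
    useful = [v for v in candidates if v.info_gain > min_gain and v.nav_cost > 0]
    if not useful:
        return None
    return max(useful, key=lambda v: v.info_gain / v.nav_cost)


candidates = [
    Viewpoint("A", info_gain=10.0, nav_cost=5.0),   # ratio 2.0
    Viewpoint("B", info_gain=12.0, nav_cost=12.0),  # ratio 1.0
    Viewpoint("C", info_gain=0.0, nav_cost=1.0),    # nothing left to observe
]
best = select_viewpoint(candidates)
```

In this toy input, viewpoint B offers the largest raw gain, but A wins because it delivers more information per unit of travel; once every candidate looks like C, the function returns `None` and exploration can terminate.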
FSMP: A Frontier-Sampling-Mixed Planner for Fast Autonomous Exploration of Complex and Large 3-D Environments
Zhang, Shiyong, Zhang, Xuebo, Dong, Qianli, Wang, Ziyu, Xi, Haobo, Yuan, Jing
In this paper, we propose a systematic framework for fast exploration of complex and large 3-D environments using micro aerial vehicles (MAVs). The key insight is the organic integration of the frontier-based and sampling-based strategies, which achieves rapid global exploration of the environment. Specifically, a field-of-view-based (FOV) frontier detector with guarantees of completeness and soundness is devised for identifying 3-D map frontiers. Unlike random sampling-based methods, a deterministic sampling technique is employed to build and maintain an incremental road map based on the recorded sensor FOVs and newly detected frontiers. With the resulting road map, we propose a two-stage path planner. First, it quickly computes the globally optimal exploration path on the road map using a lazy evaluation strategy. Then, the best exploration path is smoothed to further improve the exploration efficiency. We validate the proposed method in both simulation and real-world experiments. The comparative results demonstrate the promising performance of our planner in terms of exploration efficiency, computational time, and explored volume.
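As background for the frontier concept used above, here is a minimal Python sketch of frontier detection on a small 2-D occupancy grid. It is a deliberately simplified stand-in and not the FOV-based 3-D detector of FSMP: it scans the whole grid and marks a free cell as a frontier when one of its 4-connected neighbors is unknown. The cell encoding (`FREE`, `OCCUPIED`, `UNKNOWN`) and the function name are assumptions made for illustration.

```python
FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # assumed cell encoding for this sketch


def find_frontiers(grid):
    """Return free cells that touch at least one unknown cell (4-connectivity)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbor is enough
    return frontiers


grid = [
    [FREE, FREE,     UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE,     FREE],
]
frontier_cells = find_frontiers(grid)
```

A full-grid scan like this is sound and complete for a 2-D grid but costs time proportional to the map size on every update, which is one motivation for detectors that only inspect the region covered by the latest sensor FOV.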
DARE: Diffusion Policy for Autonomous Robot Exploration
Cao, Yuhong, Lew, Jeric, Liang, Jingsong, Cheng, Jin, Sartoretti, Guillaume
Autonomous robot exploration requires a robot to efficiently explore and map unknown environments. Compared to conventional methods that can only optimize paths based on the current robot belief, learning-based methods show the potential to achieve improved performance by drawing on past experiences to reason about unknown areas. In this paper, we propose DARE, a novel generative approach that leverages diffusion models trained on expert demonstrations, which can explicitly generate an exploration path through one-time inference. We build DARE upon an attention-based encoder and a diffusion policy model, and introduce ground truth optimal demonstrations for training to learn better patterns for exploration. The trained planner can reason about the partial belief to recognize the potential structure in unknown areas and consider these areas during path planning. Our experiments demonstrate that DARE achieves on-par performance with both conventional and learning-based state-of-the-art exploration planners, as well as good generalizability in both simulations and real-life scenarios.
Deep Reinforcement Learning-based Large-scale Robot Exploration
Cao, Yuhong, Zhao, Rui, Wang, Yizhuo, Xiang, Bairan, Sartoretti, Guillaume
In this work, we propose a deep reinforcement learning (DRL) based reactive planner to solve large-scale LiDAR-based autonomous robot exploration problems in 2D action space. Our DRL-based planner allows the agent to reactively plan its exploration path by making implicit predictions about unknown areas, based on a learned estimation of the underlying transition model of the environment. To this end, our approach relies on learned attention mechanisms for their powerful ability to capture long-term dependencies at different spatial scales to reason about the robot's entire belief over known areas. Our approach relies on ground truth information (i.e., privileged learning) to guide the environment estimation during training, as well as on a graph rarefaction algorithm, which allows models trained in small-scale environments to scale to large-scale ones. Simulation results show that our model exhibits better exploration efficiency (12% in path length, 6% in makespan) and lower planning time (60%) than the state-of-the-art planners in a 130 m x 100 m benchmark scenario. We also validate our learned model on hardware.
ArtPlanner: Robust Legged Robot Navigation in the Field
Wellhausen, Lorenz, Hutter, Marco
Due to the highly complex environment present during the DARPA Subterranean Challenge, all six funded teams relied on legged robots as part of their robotic team. Their unique locomotion skills of being able to step over obstacles require special considerations for navigation planning. In this work, we present and examine ArtPlanner, the navigation planner used by team CERBERUS during the Finals. It is based on a sampling-based method that determines valid poses with a reachability abstraction and uses learned foothold scores to restrict areas considered safe for stepping. The resulting planning graph is assigned learned motion costs by a neural network trained in simulation to minimize traversal time and limit the risk of failure. Our method achieves real-time performance with a bounded computation time. We present extensive experimental results gathered during the Finals event of the DARPA Subterranean Challenge, where this method contributed to team CERBERUS winning the competition. It powered navigation of four ANYmal quadrupeds for 90 minutes of autonomous operation without a single planning or locomotion failure.
Searching for Optimal Off-Line Exploration Paths in Grid Environments for a Robot with Limited Visibility
Li, Alberto Quattrini (Politecnico di Milano) | Amigoni, Francesco (Politecnico di Milano) | Basilico, Nicola (University of California, Merced)
Robotic exploration is an on-line problem in which autonomous mobile robots incrementally discover and map the physical structure of initially unknown environments. Usually, the performance of exploration strategies used to decide where to go next is not compared against the optimal performance obtainable in the test environments, because the latter is generally unknown. In this paper, we present a method to calculate an approximation of the optimal (shortest) exploration path in an arbitrary environment. We consider a mobile robot with limited visibility, discretize a two-dimensional environment with a regular grid, and formulate a search problem for finding the optimal exploration path in the grid, which is solved using A*. Experimental results show the viability of our approach for realistically large environments and its potential for better assessing the performance of on-line exploration strategies.
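The search formulation described above can be sketched in Python as A* over states of the form (robot cell, set of cells seen so far). This is only a sketch under stated assumptions, not the paper's exact formulation: the visibility model (all free cells within a Chebyshev radius, ignoring occlusion), the 4-connected unit-cost moves, the heuristic, and all function names are choices made here for illustration.

```python
import heapq
from itertools import count


def visible(grid, pos, radius):
    """Free cells within Chebyshev distance `radius` of `pos` (occlusion ignored)."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = pos
    return frozenset(
        (r, c)
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))
        if grid[r][c] == 0
    )


def shortest_exploration_path(grid, start, radius=1):
    """A* for the shortest path after which every free cell (0) has been seen."""
    free = frozenset((r, c) for r, row in enumerate(grid)
                     for c, v in enumerate(row) if v == 0)

    def h(pos, seen):
        # Admissible bound: each move shrinks the Chebyshev distance to any
        # cell by at most 1, so the farthest unseen cell needs this many moves.
        return max((max(abs(pos[0] - r), abs(pos[1] - c)) - radius
                    for r, c in free - seen), default=0)

    start_state = (start, visible(grid, start, radius))
    tie = count()  # tie-breaker so the heap never compares states
    heap = [(h(*start_state), next(tie), 0, start_state, [start])]
    best = {start_state: 0}
    while heap:
        _, _, g, (pos, seen), path = heapq.heappop(heap)
        if seen == free:
            return path
        if g > best.get((pos, seen), float("inf")):
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (pos[0] + dr, pos[1] + dc)
            if nxt not in free:
                continue  # out of bounds or obstacle
            state = (nxt, seen | visible(grid, nxt, radius))
            if g + 1 < best.get(state, float("inf")):
                best[state] = g + 1
                heapq.heappush(heap, (g + 1 + h(*state), next(tie), g + 1,
                                      state, path + [nxt]))
    return None  # some free cell can never be observed


corridor = [[0, 0, 0, 0, 0]]
path = shortest_exploration_path(corridor, start=(0, 0), radius=1)
```

In the five-cell corridor, the robot starts by seeing the first two cells and must advance three steps before the last cell comes into view, so the returned path has four positions. Because the state includes the set of seen cells, the state space grows exponentially with map size, which is why this kind of search serves as an off-line reference for judging on-line strategies rather than as an on-line planner.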