ATLAS Navigator: Active Task-driven LAnguage-embedded Gaussian Splatting
Ong, Dexter, Tao, Yuezhan, Murali, Varun, Spasojevic, Igor, Kumar, Vijay, Chaudhari, Pratik
arXiv.org Artificial Intelligence
The module also clusters features based on geometry and semantics in the map. The hierarchical mapper [B] runs bottom-up, ingesting the RGB and depth images and the odometric path from the robot to build a map. The top level of the map contains the submaps, the middle level the regions, and the bottom level the objects. The local map comprises the loaded submaps. The other submaps are unloaded to save memory (shown here in gray). The planning module [C] consists of a discrete planner that operates on the sparse map and generates a reference path, while the dense Gaussians in the local map are used to find the trajectory to be executed on the robot.

Abstract: We address the challenge of task-oriented navigation in unstructured and unknown environments, where robots must incrementally build and reason on rich, metric-semantic maps in real time. Since tasks may require clarification or re-specification, the information in the map must be rich enough to enable generalization across a wide range of tasks. To effectively execute tasks specified in natural language, we propose a hierarchical representation built on language-embedded Gaussian splatting that enables both sparse semantic planning that lends itself to online operation and dense geometric representation for collision-free navigation. We validate the effectiveness of our method through real-world robot experiments conducted in both cluttered indoor and kilometer-scale outdoor environments, with a competitive ratio of about 60% against privileged baselines. Experiment videos and more details can be found on our project page: https://atlasnav.github.io

This, in turn, requires robots to autonomously perceive their surroundings, gather relevant information, and make safe and efficient decisions - capabilities crucial for a variety of open-world tasking approaches over kilometer-scale environments with sparse semantics.
To enable these capabilities on board robots with privacy and compute constraints, we develop a framework to efficiently store and plan on hierarchical metric-semantic maps using only visual and inertial sensors. An overview of our method is shown in Figure 1. A cornerstone of autonomous navigation is the creation of actionable maps that effectively represent the environment and support diverse navigation and task-specific operations. These properties collectively ensure that the proposed map is not only manageable but also capable of supporting large-scale autonomous navigation to complete tasks provided in natural language. To achieve these goals, we propose an agglomerative data structure, built upon 3D Gaussian Splatting [5] (3DGS), that is consistent across both geometric and semantic scales.
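As an illustration of the three-level map described in the Figure 1 caption (submaps at the top, regions in the middle, objects at the bottom, with only nearby submaps kept loaded), the following is a minimal Python sketch. It is not the authors' implementation: every class and method name here (`ObjectNode`, `Region`, `Submap`, `HierarchicalMap`, `update_local_map`, `find_objects`) is hypothetical, and the distance-threshold rule for loading submaps is an assumption, since the excerpt does not say how loaded submaps are selected. Language embeddings and Gaussian parameters are omitted; objects are matched by a plain string label.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class ObjectNode:
    """Bottom level: a single object (hypothetical fields)."""
    label: str
    position: Point


@dataclass
class Region:
    """Middle level: a group of objects."""
    name: str
    objects: List[ObjectNode] = field(default_factory=list)


@dataclass
class Submap:
    """Top level: a submap that can be loaded into or unloaded from memory."""
    submap_id: int
    anchor: Point  # representative position of the submap (assumed)
    regions: List[Region] = field(default_factory=list)
    loaded: bool = False


@dataclass
class HierarchicalMap:
    submaps: Dict[int, Submap] = field(default_factory=dict)

    def add_submap(self, submap: Submap) -> None:
        self.submaps[submap.submap_id] = submap

    def update_local_map(self, robot_position: Point, radius: float) -> List[int]:
        """Mark submaps within `radius` of the robot as loaded, others as unloaded.

        The distance rule is an assumption made for this sketch.
        Returns the sorted ids of the loaded submaps (the "local map").
        """
        for submap in self.submaps.values():
            submap.loaded = math.dist(submap.anchor, robot_position) <= radius
        return sorted(i for i, s in self.submaps.items() if s.loaded)

    def find_objects(self, label: str) -> Iterator[Tuple[int, str, ObjectNode]]:
        """Walk the sparse hierarchy top-down and yield matching objects."""
        for submap in self.submaps.values():
            for region in submap.regions:
                for obj in region.objects:
                    if obj.label == label:
                        yield submap.submap_id, region.name, obj
```

A usage example under the same assumptions: after adding two submaps anchored at `(0, 0, 0)` and `(50, 0, 0)`, calling `update_local_map((1.0, 0.0, 0.0), radius=10.0)` marks only the first as loaded, while `find_objects` still searches the sparse hierarchy of both, which mirrors the idea of planning on the sparse map while keeping dense data only for the local map.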
Feb-27-2025