Goto

Collaborating Authors

 osmag


osmAG-LLM: Zero-Shot Open-Vocabulary Object Navigation via Semantic Maps and Large Language Models Reasoning

arXiv.org Artificial Intelligence

Recent open-vocabulary robot mapping methods enrich dense geometric maps with pre-trained visual-language features, achieving a high level of detail and guiding robots to find objects specified by open-vocabulary language queries. While the issue of scalability for such approaches has received some attention, another fundamental problem is that high-detail object mapping quickly becomes outdated, as objects get moved around a lot. In this work, we develop a mapping and navigation system for object-goal navigation that, from the ground up, considers the possibilities that a queried object can have moved, or may not be mapped at all. Instead of striving for high-fidelity mapping detail, we consider that the main purpose of a map is to provide environment grounding and context, which we combine with the semantic priors of LLMs to reason about object locations and deploy an active, online approach to navigate to the objects. Through simulated and real-world experiments we find that our approach tends to have higher retrieval success at shorter path lengths for static objects and by far outperforms prior approaches in cases of dynamic or unmapped object queries. We provide our code and dataset at: https://anonymous.4open.science/r/osmAG-LLM.


Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs

arXiv.org Artificial Intelligence

Recently, Large Language Models (LLMs) have demonstrated great potential in robotic applications by providing essential general knowledge for situations that can not be pre-programmed beforehand. Generally speaking, mobile robots need to understand maps to execute tasks such as localization or navigation. In this letter, we address the problem of enabling LLMs to comprehend Area Graph, a text-based map representation, in order to enhance their applicability in the field of mobile robotics. Area Graph is a hierarchical, topometric semantic map representation utilizing polygons to demark areas such as rooms, corridors or buildings. In contrast to commonly used map representations, such as occupancy grid maps or point clouds, osmAG (Area Graph in OpensStreetMap format) is stored in a XML textual format naturally readable by LLMs. Furthermore, conventional robotic algorithms such as localization and path planning are compatible with osmAG, facilitating this map representation comprehensible by LLMs, traditional robotic algorithms and humans. Our experiments show that with a proper map representation, LLMs possess the capability to understand maps and answer queries based on that understanding. Following simple fine-tuning of LLaMA2 models, it surpassed ChatGPT-3.5 in tasks involving topology and hierarchy understanding. Our dataset, dataset generation code, fine-tuned LoRA adapters can be accessed at https://github.com/xiefujing/LLM-osmAG-Comprehension.


osmAG: Hierarchical Semantic Topometric Area Graph Maps in the OSM Format for Mobile Robotics

arXiv.org Artificial Intelligence

Maps are essential to mobile robotics tasks like localization and planning. We propose the open street map (osm) XML based Area Graph file format to store hierarchical, topometric semantic multi-floor maps of indoor and outdoor environments, since currently no such format is popular within the robotics community. Building on-top of osm we leverage the available open source editing tools and libraries of osm, while adding the needed mobile robotics aspect with building-level obstacle representation yet very compact, topometric data that facilitates planning algorithms. Through the use of common osm keys as well as custom ones we leverage the power of semantic annotation to enable various applications. For example, we support planning based on robot capabilities, to take the locomotion mode and attributes in conjunction with the environment information into account. The provided C++ library is integrated into ROS. We evaluate the performance of osmAG using real data in a global path planning application on a very big osmAG map, demonstrating its convenience and effectiveness for mobile robots.