Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs
Haonan Chang, Kowndinya Boyalakuntla, Shiyang Lu, Siwei Cai, Eric Jing, Shreesh Keskar, Shijie Geng, Adeeb Abbas, Lifeng Zhou, Kostas Bekris, Abdeslam Boularias
We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries. Unlike conventional semantic-based object localization approaches, our system facilitates context-aware entity localization, allowing for queries such as "pick up a cup on a kitchen table" or "navigate to a sofa on which someone is sitting". In contrast to existing research on 3D scene graphs, OVSG supports free-form text input and open-vocabulary querying. Through a series of comparative experiments using the ScanNet dataset and a self-collected dataset, we demonstrate that our proposed approach significantly surpasses the performance of previous semantic-based localization techniques. Moreover, we highlight the practical application of OVSG in real-world robot navigation and manipulation experiments.
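To make the idea of context-aware grounding concrete, the following is a minimal, illustrative sketch of querying an open-vocabulary scene graph, not the authors' OVSG implementation. It assumes a hypothetical scene-graph schema (nodes with free-form captions, edges labeled with spatial relations) and replaces the real open-vocabulary text encoder with a simple bag-of-words similarity so the example stays self-contained.

```python
# Illustrative sketch only: a toy open-vocabulary scene graph and a
# context-aware query ("a cup on a kitchen table"). All names and the
# scoring scheme are simplified placeholders, not the OVSG codebase.
from dataclasses import dataclass, field
from collections import Counter
import math


def embed(text: str) -> Counter:
    """Placeholder for an open-vocabulary text encoder (a learned
    embedding model in a real system); here just bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Node:
    node_id: int
    caption: str  # free-form description of the entity
    feature: Counter = field(init=False)

    def __post_init__(self):
        self.feature = embed(self.caption)


@dataclass
class SceneGraph:
    nodes: list            # Node objects, indexed by node_id
    edges: dict             # (src_id, dst_id) -> relation string

    def ground(self, target: str, relation: str, anchor: str):
        """Return the node best matching `target` that is connected to a
        node matching `anchor` through the given relation."""
        best, best_score = None, -1.0
        t_feat, a_feat = embed(target), embed(anchor)
        for (src, dst), rel in self.edges.items():
            if rel != relation:
                continue
            score = (cosine(self.nodes[src].feature, t_feat)
                     + cosine(self.nodes[dst].feature, a_feat))
            if score > best_score:
                best, best_score = self.nodes[src], score
        return best


# Toy scene: a cup on a kitchen table, a mug on an office desk.
graph = SceneGraph(
    nodes=[Node(0, "red ceramic cup"), Node(1, "kitchen table"),
           Node(2, "coffee mug"), Node(3, "office desk")],
    edges={(0, 1): "on", (2, 3): "on"},
)
result = graph.ground(target="a cup", relation="on", anchor="a kitchen table")
print(result.caption)  # -> "red ceramic cup"
```

The contextual constraint (the "on a kitchen table" anchor) is what distinguishes this kind of query from plain semantic object localization: the mug on the desk also matches "a cup"-like semantics poorly, but even a strong match would be rejected without the right relation to the right anchor.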
arXiv.org Artificial Intelligence
Sep-27-2023
- Genre:
- Research Report (0.40)
- Technology:
- Information Technology > Artificial Intelligence > Robots (0.53)