Reliable Semantic Understanding for Real World Zero-shot Object Goal Navigation
Unlu, Halil Utku, Yuan, Shuaihang, Wen, Congcong, Huang, Hao, Tzes, Anthony, Fang, Yi
arXiv.org Artificial Intelligence
We introduce an innovative approach to advancing semantic understanding in zero-shot object goal navigation (ZS-OGN), enhancing the autonomy of robots in unfamiliar environments. Traditional reliance on labeled data has been a limitation for robotic adaptability, which we address by employing a dual-component framework that integrates a GLIP Vision Language Model for initial detection and an Instruction-BLIP model for validation. This combination not only refines object and environmental recognition but also fortifies the semantic interpretation, pivotal for navigational decision-making. Our method, rigorously tested in both simulated and real-world settings, exhibits marked improvements in navigation precision and reliability.
Oct-29-2024
- Country:
- Asia > Middle East
- UAE > Abu Dhabi Emirate > Abu Dhabi (0.15)
- Europe > Switzerland
- North America > United States
- New York > Kings County > New York City (0.04)
- Genre:
- Overview > Innovation (0.34)
- Research Report > Promising Solution (0.48)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (1.00)
- Natural Language > Large Language Model (0.87)
- Robots (1.00)
- Vision (1.00)