Response to Reviews for NeurIPS paper: Object Goal Navigation using Goal-Oriented Semantic Exploration

Neural Information Processing Systems

We thank the reviewers for their valuable feedback and comments. R3 & R5 point out that parts of some modules are based on prior work; however, the complete method is significantly different from prior methods ([25,37,38,41]) tackling object goal navigation. Novelty is also recognized by R1 ("clear algorithmic innovation") and R2 ("adds several new features"). All reviewers have appreciated the real-world experiments in the submission. R1 & R5 have suggested there should be more emphasis on real-world experiments.


Object Goal Navigation using Goal-Oriented Semantic Exploration

Neural Information Processing Systems

This work studies the problem of object goal navigation, which involves navigating to an instance of a given object category in unseen environments. End-to-end learning-based navigation methods struggle at this task as they are ineffective at exploration and long-term planning. We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently based on the goal object category. Empirical results in visually realistic simulation environments show that the proposed model outperforms a wide range of baselines, including end-to-end learning-based methods as well as modular map-based methods, and led to the winning entry of the CVPR-2020 Habitat ObjectNav Challenge. Ablation analysis indicates that the proposed model learns semantic priors of the relative arrangement of objects in a scene and uses them to explore efficiently. The domain-agnostic module design allows us to transfer our model to a mobile robot platform and achieve similar performance for object goal navigation in the real world.
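The core of the approach described above is an episodic semantic map keyed by object category, which the exploration policy queries for the goal. A minimal sketch of such a map is below; the grid size, category indexing, and max-pooling update rule are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

# Toy episodic semantic map: one 2D confidence channel per object category.
# Grid size, category count, and the update rule are illustrative assumptions.
class SemanticMap:
    def __init__(self, size=64, num_categories=3):
        self.grid = np.zeros((num_categories, size, size), dtype=np.float32)

    def update(self, category, x, y, confidence=1.0):
        # Fuse a new detection into the map (max-pooling over the episode).
        self.grid[category, y, x] = max(self.grid[category, y, x], confidence)

    def goal_cell(self, category):
        # Return the most confident cell for the goal category, if any seen.
        channel = self.grid[category]
        if channel.max() == 0.0:
            return None  # goal unseen: the caller should keep exploring
        y, x = np.unravel_index(channel.argmax(), channel.shape)
        return (int(y), int(x))

m = SemanticMap()
m.update(category=1, x=10, y=20, confidence=0.9)
print(m.goal_cell(1))  # (20, 10)
```

When the goal category has not yet been observed, `goal_cell` returns `None`, which is where a goal-oriented exploration policy would take over.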


PanoNav: Mapless Zero-Shot Object Navigation with Panoramic Scene Parsing and Dynamic Memory

Jin, Qunchao, Wu, Yilin, Chen, Changhao

arXiv.org Artificial Intelligence

Zero-shot object navigation (ZSON) in unseen environments remains a challenging problem for household robots, requiring strong perceptual understanding and decision-making capabilities. While recent methods leverage metric maps and Large Language Models (LLMs), they often depend on depth sensors or prebuilt maps, limiting the spatial reasoning ability of Multimodal Large Language Models (MLLMs). Mapless ZSON approaches have emerged to address this, but they typically make short-sighted decisions, leading to local deadlocks due to a lack of historical context. We propose PanoNav, a fully RGB-only, mapless ZSON framework that integrates a Panoramic Scene Parsing module to unlock the spatial parsing potential of MLLMs from panoramic RGB inputs, and a Memory-guided Decision-Making mechanism enhanced by a Dynamic Bounded Memory Queue to incorporate exploration history and avoid local deadlocks. Experiments on the public navigation benchmark show that PanoNav significantly outperforms representative baselines in both SR and SPL metrics.
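The bounded-memory idea in the abstract above can be sketched with a fixed-capacity queue of recently visited places that biases the next decision away from them. The capacity and the "prefer unvisited candidates" rule here are illustrative assumptions, not PanoNav's actual mechanism.

```python
from collections import deque

# Toy bounded memory for mapless navigation: remember the last few visited
# places and avoid revisiting them. Capacity and the tie-break rule are
# illustrative assumptions.
class BoundedMemory:
    def __init__(self, capacity=3):
        self.queue = deque(maxlen=capacity)  # oldest entries fall off automatically

    def visit(self, place_id):
        self.queue.append(place_id)

    def pick_next(self, candidates):
        # Prefer candidates not in recent memory, to avoid local deadlocks.
        fresh = [c for c in candidates if c not in self.queue]
        return fresh[0] if fresh else candidates[0]

mem = BoundedMemory(capacity=3)
for p in ["hall", "kitchen", "hall"]:
    mem.visit(p)
print(mem.pick_next(["hall", "kitchen", "bedroom"]))  # bedroom
```

Because the queue is bounded, places visited long ago eventually become eligible again, which is what distinguishes this from simply blacklisting visited locations.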


LGR: LLM-Guided Ranking of Frontiers for Object Goal Navigation

Uno, Mitsuaki, Tanaka, Kanji, Iwata, Daiki, Noda, Yudai, Miyazaki, Shoya, Terashima, Kouki

arXiv.org Artificial Intelligence

Object Goal Navigation (OGN) is a fundamental task for robots and AI, with key applications such as mobile robot image databases (MRID). In particular, mapless OGN is essential in scenarios involving unknown or dynamic environments. This study aims to enhance recent modular mapless OGN systems by leveraging the commonsense reasoning capabilities of large language models (LLMs). Specifically, we address the challenge of determining the visiting order in frontier-based exploration by framing it as a frontier ranking problem. Our approach is grounded in recent findings that, while LLMs cannot determine the absolute value of a frontier, they excel at evaluating the relative value between multiple frontiers viewed within a single image, using the view image as context. We dynamically manage the frontier list by adding and removing elements, using an LLM as a ranking model. The ranking results are represented as reciprocal rank vectors, which are ideal for multi-view, multi-query information fusion. Object Goal Navigation (OGN) is a task in which a robot explores and locates a user-specified object within a workspace, widely studied in robotics and artificial intelligence [1]. If object locations are pre-recorded on a map, the most efficient method is to retrieve the object from the mobile robot image database [2]-[4]. However, in unknown environments or when map information is unreliable, mapless OGN is essential. Existing OGN methods include end-to-end approaches, which directly generate action commands from sensor data [5], but these require extensive training data and high computational costs.
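The reciprocal-rank representation mentioned above lends itself to standard reciprocal rank fusion (RRF) across per-view rankings. A minimal sketch is below; the constant `k=60` follows the common RRF formula and, along with the frontier identifiers, is an assumption here rather than LGR's exact configuration.

```python
# Sketch of reciprocal-rank fusion for frontier ranking: combine ordered
# frontier lists from multiple views/queries into one visiting order.
# The smoothing constant k=60 is the conventional RRF default (an assumption).
def rrf_fuse(rankings, k=60):
    # rankings: list of ordered frontier-id lists, best first
    scores = {}
    for ranking in rankings:
        for rank, frontier in enumerate(ranking, start=1):
            scores[frontier] = scores.get(frontier, 0.0) + 1.0 / (k + rank)
    # Higher fused score = visit earlier.
    return sorted(scores, key=scores.get, reverse=True)

views = [["f2", "f1", "f3"], ["f2", "f3", "f1"], ["f1", "f2", "f3"]]
print(rrf_fuse(views))  # ['f2', 'f1', 'f3']
```

RRF only needs rank positions, never absolute scores, which matches the observation that an LLM judges relative rather than absolute frontier value.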


Review for NeurIPS paper: Object Goal Navigation using Goal-Oriented Semantic Exploration

Neural Information Processing Systems

Summary and Contributions: This paper presents an extension to recent work on Active Neural SLAM [1], in which semantic information about object categories is explicitly incorporated into the model. The extensions to the model architecture provide explicit semantic information about the various objects of the scene in the generated 2D map, which allows an agent to navigate its environment and find a specified goal object much more efficiently than the baselines, some of which use semantic information and some of which do not. The comparison was performed using Gibson [2] and Matterport3D (MP3D) [3], which include 3D reconstructions of real environments. Training was performed on 86 scenes and testing on 16.


Review for NeurIPS paper: Object Goal Navigation using Goal-Oriented Semantic Exploration

Neural Information Processing Systems

This paper proposes to train an ObjectNav policy that generalises to unseen environments by using a modular system that classifies objects and builds an episodic semantic map, which it uses to explore the environment based on the object category, building upon the hierarchical method in "Learning to explore using Active Neural SLAM". The method achieved SOTA performance on the 2020 CVPR Object Goal Navigation Habitat Challenge. Interestingly, the policy, trained on Gibson and MP3D, has been transferred and deployed on a real robot with some success. While the initial reviews were mixed (9, 7, 4, 5), the reviewers converged on (8, 7, 6, 6), agreeing during discussion that the paper deserved to be accepted. Based on the reviews, I recommend this paper for acceptance as a spotlight or poster presentation.


Reliable Semantic Understanding for Real World Zero-shot Object Goal Navigation

Unlu, Halil Utku, Yuan, Shuaihang, Wen, Congcong, Huang, Hao, Tzes, Anthony, Fang, Yi

arXiv.org Artificial Intelligence

We introduce an innovative approach to advancing semantic understanding in zero-shot object goal navigation (ZS-OGN), enhancing the autonomy of robots in unfamiliar environments. Traditional reliance on labeled data has been a limitation for robotic adaptability, which we address by employing a dual-component framework that integrates a GLIP Vision Language Model for initial detection and an Instruction-BLIP model for validation. This combination not only refines object and environmental recognition but also fortifies the semantic interpretation, pivotal for navigational decision-making. Our method, rigorously tested in both simulated and real-world settings, exhibits marked improvements in navigation precision and reliability.


Advancing Object Goal Navigation Through LLM-enhanced Object Affinities Transfer

Lin, Mengying, Chen, Yaran, Zhao, Dongbin, Wang, Zhaoran

arXiv.org Artificial Intelligence

In object goal navigation, agents navigate towards objects identified by category labels using visual and spatial information. Previously, purely network-based methods typically relied on historical data to estimate object affinities, lacking adaptability to new environments and unseen targets. Meanwhile, employing Large Language Models (LLMs) for navigation as either planners or agents, though offering a broad knowledge base, is cost-inefficient and lacks targeted historical experience. Addressing these challenges, we present the LLM-enhanced Object Affinities Transfer (LOAT) framework, integrating LLM-derived object semantics with network-based approaches to leverage experiential object affinities, thus improving adaptability in unfamiliar settings. LOAT employs a dual-module strategy: a generalized affinities module for accessing LLMs' vast knowledge and an experiential affinities module for applying learned object semantic relationships, complemented by a dynamic fusion module harmonizing these information sources based on temporal context. The resulting scores activate semantic maps before feeding into downstream policies, enhancing navigation systems with context-aware inputs. Our evaluations in the AI2-THOR and Habitat simulators demonstrate improvements in both navigation success rates and efficiency, validating LOAT's efficacy in integrating LLM insights for improved object goal navigation.
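The dual-module fusion described above can be illustrated as a time-weighted blend of two affinity sources. Everything in this sketch is hypothetical: the exponential weight schedule, the object names, and the score values stand in for LOAT's learned fusion module.

```python
import math

# Illustrative fusion of two object-affinity sources, loosely following the
# dual-module idea: generalized (LLM-derived) priors vs. experiential
# (learned) affinities. The schedule and all values are hypothetical.
def fuse_affinities(llm_scores, learned_scores, t, tau=10.0):
    # Weight shifts from LLM priors toward learned affinities as time t grows.
    w = math.exp(-t / tau)
    return {obj: w * llm_scores[obj] + (1 - w) * learned_scores[obj]
            for obj in llm_scores}

llm = {"sofa": 0.8, "sink": 0.2}       # generalized priors
learned = {"sofa": 0.4, "sink": 0.9}   # experiential affinities
early = fuse_affinities(llm, learned, t=0)    # early episode: trust priors
late = fuse_affinities(llm, learned, t=100)   # late episode: trust experience
print(max(early, key=early.get), max(late, key=late.get))  # sofa sink
```

The point of the blend is that the agent can fall back on broad commonsense priors before it has accumulated any in-environment experience.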