RANa: Retrieval-Augmented Navigation

Monaci, Gianluca, Rezende, Rafael S., Deffayet, Romain, Csurka, Gabriela, Bono, Guillaume, Déjean, Hervé, Clinchant, Stéphane, Wolf, Christian

Jul-30-2025–arXiv.org Artificial Intelligence

Methods for navigation based on large-scale learning typically treat each episode as a new problem, where the agent is spawned with a clean memory in an unknown environment. While these generalization capabilities to an unknown environment are extremely important, we claim that, in a realistic setting, an agent should have the capacity of exploiting information collected during earlier robot operations. We address this by introducing a new retrieval-augmented agent, trained with RL, capable of querying a database collected from previous episodes in the same environment and learning how to integrate this additional context information. We introduce a unique agent architecture for the general navigation task, evaluated on ImageNav, Instance-ImageNav and ObjectNav. Our retrieval and context encoding methods are data-driven and employ vision foundation models (FM) for both semantic and geometric understanding. We propose new benchmarks for these settings and we show that retrieval allows zero-shot transfer across tasks and environments while significantly improving performance.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

Jul-30-2025

arXiv.org PDF

Add feedback

Genre:
- Research Report (1.00)

Industry:
- Leisure & Entertainment (0.46)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Robots (1.00)
    - Representation & Reasoning > Agents (1.00)
    - Natural Language > Large Language Model (0.90)
    - Machine Learning > Neural Networks
      - Deep Learning (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found