Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation

Wang, Shuo; Wang, Yucheng; Lian, Guoxin; Wang, Yongcai; Chen, Maiyue; Wang, Kaihui; Zhang, Bo; Su, Zhizhong; Zhou, Yutian; Li, Wanting; Li, Deying; Fan, Zhaoxin

Nov-24-2025 – arXiv.org Artificial Intelligence

Vision-Language Navigation requires agents to act coherently over long horizons by understanding not only the local visual context but also how far they have advanced within a multi-step instruction. However, recent Vision-Language-Action models focus on direct action prediction, and earlier progress methods predict numeric achievements; both overlook the monotonic co-progression property of the observation and instruction sequences. Building on this insight, Progress-Think introduces semantic progress reasoning, which predicts instruction-style progress from visual observations to enable more accurate navigation. To achieve this without expensive annotations, we propose a three-stage framework. First, Self-Aligned Progress Pretraining bootstraps a reasoning module via a novel differentiable alignment between visual history and instruction prefixes. Next, Progress-Guided Policy Pretraining injects the learned progress states into the navigation context, guiding the policy toward consistent actions. Finally, Progress-Policy Co-Finetuning jointly optimizes both modules with tailored progress-aware reinforcement objectives. Experiments on R2R-CE and RxR-CE show state-of-the-art success and efficiency, demonstrating that semantic progress yields a more consistent representation of navigation advancement.
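
The abstract does not specify how the differentiable alignment between visual history and instruction prefixes is computed. As an illustration only, the sketch below uses a soft dynamic-time-warping cost, a standard differentiable way to score two sequences under a monotonic alignment constraint. It is a stand-in technique, not the authors' method, and the function names, the cosine distance, and the `gamma` smoothing parameter are all assumptions introduced for this example.

```python
import math

INF = float("inf")


def soft_min(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    finite = [v for v in values if v != INF]
    m = min(finite)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in finite))


def cosine_distance(a, b):
    """1 - cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def soft_monotonic_alignment_cost(obs_feats, instr_feats, gamma=0.1):
    """Soft-DTW cost between observation and instruction feature sequences.

    Hypothetical stand-in for a differentiable monotonic alignment:
    both indices may only stay or move forward, so a low cost means the
    two sequences progress together in order.
    """
    n, m = len(obs_feats), len(instr_feats)
    # cost[i][j]: soft-minimal cost of aligning the first i observations
    # with the first j instruction segments.
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cosine_distance(obs_feats[i - 1], instr_feats[j - 1])
            previous = [cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]]
            cost[i][j] = d + soft_min(previous, gamma)
    return cost[n][m]


# Toy features: two observations and two instruction segments.
observations = [[1.0, 0.0], [0.0, 1.0]]
in_order = [[1.0, 0.0], [0.0, 1.0]]
reversed_order = [[0.0, 1.0], [1.0, 0.0]]

aligned_cost = soft_monotonic_alignment_cost(observations, in_order)
shuffled_cost = soft_monotonic_alignment_cost(observations, reversed_order)
```

In this toy case, the instruction sequence that matches the observation order receives a lower cost than the reversed one, which is the behavior a monotonic co-progression objective would be expected to reward. In a training setup, the same recursion would be written in an autodiff framework so that the cost can be backpropagated into the feature encoders.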

machine learning, natural language, navigation

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language (1.00)
  - Machine Learning (1.00)
  - Cognitive Science > Problem Solving (1.00)
  - Vision (0.94)
