Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation

Shuo Wang, Y

Jun-15-2026, 22:40:14 GMT–Neural Information Processing Systems

Vision-Language Navigation (VLN) is a critical task for developing embodied agents that can follow natural language instructions to navigate complex real-world environments. Recent advances driven by large pretrained models have significantly improved generalization and instruction grounding compared to traditional approaches. However, reasoning strategies for this task remain underexplored: navigation is action-centric and long-horizon, while Chain-of-Thought (CoT) reasoning has mainly shown success in static tasks such as visual question answering. To address this gap, we conduct the first systematic evaluation of reasoning strategies for VLN, including No-Think (direct action prediction), Pre-Think (reasoning before action), and Post-Think (reasoning after action). Surprisingly, our findings reveal a Test-time Reasoning Collapse issue, where reasoning during testing degrades navigation accuracy, highlighting the challenges of integrating reasoning into embodied navigation.
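The three strategies compared above differ only in whether a reasoning trace is produced and where it sits relative to the action. The following is a minimal illustrative sketch of that ordering, not the authors' implementation: the names `Strategy`, `response_fields`, and `build_prompt`, as well as the prompt wording, are assumptions introduced here for illustration.

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical labels for the three strategies named in the abstract."""
    NO_THINK = "no_think"      # direct action prediction
    PRE_THINK = "pre_think"    # reasoning before action
    POST_THINK = "post_think"  # reasoning after action


def response_fields(strategy: Strategy) -> list[str]:
    """Return the ordered output fields the model is asked to produce."""
    if strategy is Strategy.NO_THINK:
        return ["action"]
    if strategy is Strategy.PRE_THINK:
        return ["reasoning", "action"]
    if strategy is Strategy.POST_THINK:
        return ["action", "reasoning"]
    raise ValueError(f"unknown strategy: {strategy!r}")


def build_prompt(instruction: str, strategy: Strategy) -> str:
    """Assemble a plain-text prompt; the wording is an assumption, not from the paper."""
    lines = [
        f"Instruction: {instruction}",
        "Respond with the following fields, in this order:",
    ]
    # Number the requested fields so the ordering is explicit in the prompt.
    lines += [f"{i}. {name}" for i, name in enumerate(response_fields(strategy), start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    for s in Strategy:
        print(f"--- {s.value} ---")
        print(build_prompt("Walk past the sofa and stop at the door.", s))
```

In this sketch, No-Think requests only the action, while Pre-Think and Post-Think request the same two fields in opposite order, which is the distinction the abstract draws.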

artificial intelligence, natural language, reasoning

Country:
- Asia > China (0.14)

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language (1.00)
  - Cognitive Science > Problem Solving (1.00)
