ST-Booster: An Iterative SpatioTemporal Perception Booster for Vision-and-Language Navigation in Continuous Environments

Lu Yue, Dongliang Zhou, Liang Xie, Erwei Yin, Feitian Zhang

arXiv.org Artificial Intelligence 

Abstract--Vision-and-Language Navigation in Continuous Environments (VLN-CE) requires agents to navigate previously unseen, continuous spaces based on natural language instructions. Compared to discrete settings, VLN-CE poses two core perception challenges. First, the absence of predefined observation points leads to heterogeneous visual memories and weakened global spatial correlations. Second, cumulative reconstruction errors in three-dimensional scenes introduce structural noise, impairing local feature perception. To address these challenges, this paper proposes ST-Booster, an iterative spatiotemporal booster that enhances navigation performance through multi-granularity perception and instruction-aware reasoning. ST-Booster consists of three key modules -- Hierarchical SpatioTemporal Encoding (HSTE), Multi-Granularity Aligned Fusion (MGAF), and Value-Guided Waypoint Generation (VGWG). The representations produced by these modules are iteratively refined through pretraining tasks. During reasoning, VGWG generates Guided Attention Heatmaps (GAHs) to explicitly model environment-instruction relevance and optimize waypoint selection. Extensive comparative experiments and performance analyses demonstrate that ST-Booster outperforms existing state-of-the-art methods, particularly in complex, disturbance-prone environments.
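To make the GAH idea concrete, the sketch below illustrates one plausible way to score environment-instruction relevance: a softmax-normalized cosine similarity between an instruction embedding and per-cell map features. This is a hypothetical illustration, not the paper's actual VGWG module; the function name, embedding dimension, and similarity choice are all assumptions.

```python
import numpy as np

def guided_attention_heatmap(instr_emb: np.ndarray, map_feats: np.ndarray) -> np.ndarray:
    """Hypothetical GAH sketch: relevance of each map cell to an instruction.

    instr_emb: (d,) pooled instruction embedding.
    map_feats: (H, W, d) per-cell features of a local grid map.
    Returns an (H, W) heatmap that sums to 1 over all cells.
    """
    # Cosine similarity: normalize both sides (epsilon guards zero vectors).
    instr = instr_emb / (np.linalg.norm(instr_emb) + 1e-8)
    feats = map_feats / (np.linalg.norm(map_feats, axis=-1, keepdims=True) + 1e-8)
    scores = feats @ instr                       # (H, W) similarities
    # Numerically stable softmax over all cells.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
heatmap = guided_attention_heatmap(rng.normal(size=64), rng.normal(size=(8, 8, 64)))
print(heatmap.shape)  # (8, 8)
```

High-relevance cells in such a heatmap could then bias waypoint selection toward instruction-consistent regions, which is the role the abstract assigns to GAHs.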