SE-VLN: A Self-Evolving Vision-Language Navigation Framework Based on Multimodal Large Language Models
