LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs

Xinyuan Zhang, Yonglin Tian, Fei Lin, Yue Liu, Jing Ma, Kornélia Sára Szatmáry, Fei-Yue Wang

arXiv.org Artificial Intelligence 

Abstract: The growing demand for intelligent logistics, particularly fine-grained terminal delivery, underscores the need for autonomous UAV (Unmanned Aerial Vehicle)-based delivery systems. However, most existing last-mile delivery studies rely on ground robots, while current UAV-based Vision-Language Navigation (VLN) tasks primarily focus on coarse-grained, long-range goals, making them unsuitable for precise terminal delivery. To bridge this gap, we propose LogisticsVLN, a scalable aerial delivery system built on multimodal large language models (MLLMs) for autonomous terminal delivery. LogisticsVLN integrates lightweight Large Language Models (LLMs) and Visual-Language Models (VLMs) in a modular pipeline for request understanding, floor localization, object detection, and action-decision making. To support research and evaluation in this new setting, we construct the Vision-Language Delivery (VLD) dataset within the CARLA simulator. In addition, we conduct subtask-level evaluations of each module of our system, offering valuable insights for improving the robustness and real-world deployment of foundation model-based vision-language delivery systems.

INTRODUCTION

Driven by the rapid growth of e-commerce and urbanization, logistics has become an increasingly critical component of modern society [1]. In particular, there is a growing demand for stable, efficient, and user-centric terminal delivery,

This work is partly supported by the Science and Technology Development Fund, Macao SAR (File no.

Xinyuan Zhang is with the School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China (e-mail: zhangxinyuan23@mails.ucas.ac.cn).