Data-Agnostic Robotic Long-Horizon Manipulation with Vision-Language-Guided Closed-Loop Feedback

Yuan Meng, Xiangtong Yao, Haihui Ye, Yirui Zhou, Shengqiang Zhang, Zhenshan Bing, Alois Knoll

arXiv.org Artificial Intelligence 

Our framework demonstrates state-of-the-art performance across diverse long-horizon tasks, achieving strong generalization in both simulated and real-world scenarios. Videos and code are available at https://ghiara.github.io/DAHLIA/.

I. INTRODUCTION

Language-conditioned robotic manipulation is an emerging field at the intersection of robotics, natural language processing, and computer vision, which aims to enable robots to interpret human commands and perform complex tasks using multi-modal sensing [1]. Imitation learning (IL) and reinforcement learning (RL) have traditionally been the dominant approaches for training robotic manipulation policies. However, recent IL and RL methods are often constrained to narrow task distributions, leading to sampling inefficiency and high sensitivity to distributional shifts, which limits their ability to generalize to diverse and complex scenarios. Additionally, both IL and RL are data-driven, requiring large-scale expert demonstrations, yet Internet-scale data collection for embodied AI remains a substantial challenge.

In contrast, the natural language processing domain has seen state-of-the-art (SOTA) LLMs like GPT [2] and Llama [3] achieve human-like semantic understanding and common sense reasoning by training on massive datasets. Within embodied AI, LLMs offer a promising solution to bridge the gap between high-level language instructions and low-level robotic control,

1 Yuan Meng, Xiangtong Yao, Haihui Ye, Yirui Zhou, and Alois Knoll are with the School of Computation, Information and Technology, Technical University of Munich, Germany.
2 Shengqiang Zhang is with the Center for Information and Language Processing, Ludwig Maximilian University of Munich, Germany.
3 Zhenshan Bing is with the State Key Laboratory for Novel Software Technology, Nanjing University, China.
