Learning to Drive Anywhere with Model-Based Reannotation

Noriaki Hirose, Lydia Ignatova, Kyle Stachowicz, Catherine Glossop, Sergey Levine, Dhruv Shah

arXiv.org Artificial Intelligence 

Figure 1: We train a highly generalizable navigation policy that can control robots in a variety of conditions and be deployed zero-shot in new environments across the world. Our proposed method, Model-Based ReAnnotation, enables imitation learning from noisy, passive data, such as low-quality crowd-sourced demonstrations or even videos from the web.

Abstract -- Developing broadly generalizable visual navigation policies for robots is a significant challenge, primarily constrained by the availability of large-scale, diverse training data. While curated datasets collected by researchers offer high quality, their limited size restricts policy generalization. To overcome this, we explore leveraging abundant, passively collected data sources, including large volumes of crowd-sourced teleoperation data and unlabeled YouTube videos, despite their potential for lower quality or missing action labels. We propose Model-Based ReAnnotation (MBRA), a framework that utilizes a learned short-horizon, model-based expert model to relabel or generate high-quality actions for these passive datasets. This relabeled data is then distilled into LogoNav, a long-horizon navigation policy conditioned on visual goals or GPS waypoints. We demonstrate that LogoNav, trained using MBRA-processed data, achieves state-of-the-art performance, enabling robust navigation over distances exceeding 300 meters in previously unseen indoor and outdoor environments.
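The core idea of the relabeling step can be illustrated with a minimal sketch: a short-horizon expert predicts the action that connects each observation to the next one in a passive trajectory, replacing noisy or missing labels before imitation learning. All names and the toy expert below are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of Model-Based ReAnnotation (MBRA):
# a short-horizon "expert" relabels noisy or missing actions in a
# passive trajectory; the relabeled data would then be used for
# imitation learning. Names here are illustrative only.

from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Obs = Tuple[float, float]  # stand-in for an image observation

@dataclass
class Frame:
    obs: Obs
    action: Optional[Obs]  # original label: may be None (web video) or noisy

def reannotate(traj: List[Frame],
               expert: Callable[[Obs, Obs], Obs]) -> List[Frame]:
    """Replace each frame's action with the expert's prediction for
    reaching the next frame's observation (short-horizon relabeling)."""
    return [Frame(cur.obs, expert(cur.obs, nxt.obs))
            for cur, nxt in zip(traj, traj[1:])]

# Toy expert: for 2-D positions, the "action" is simply the
# displacement needed to reach the next observation.
def toy_expert(obs: Obs, goal: Obs) -> Obs:
    return (goal[0] - obs[0], goal[1] - obs[1])

traj = [Frame((0.0, 0.0), None),        # missing label
        Frame((1.0, 0.0), (9.0, 9.0)),  # noisy label
        Frame((1.0, 1.0), None)]

clean = reannotate(traj, toy_expert)
print([f.action for f in clean])  # [(1.0, 0.0), (0.0, 1.0)]
```

In the paper's setting, the expert is a learned model-based policy and the observations are images; the sketch only shows the data-flow: every frame of the passive dataset receives a fresh action label from the expert.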