BEV-ODOM: Reducing Scale Drift in Monocular Visual Odometry with BEV Representation
Wei, Yufei, Lu, Sha, Han, Fuzhang, Xiong, Rong, Wang, Yue
arXiv.org Artificial Intelligence
Abstract: Monocular visual odometry (MVO) is vital in autonomous navigation and robotics, providing a cost-effective and flexible motion tracking solution, but the inherent scale ambiguity in monocular setups often leads to cumulative errors over time. In this paper, we present BEV-ODOM, a novel MVO framework that leverages the Bird's Eye View (BEV) representation to address scale drift. Unlike existing approaches, BEV-ODOM integrates a depth-based perspective-view (PV) to BEV encoder, a correlation feature extraction neck, and a CNN-MLP-based decoder, enabling it to estimate motion across three degrees of freedom without the need for depth supervision or complex optimization techniques. Our framework reduces scale drift in long-term sequences and achieves accurate motion estimation across various datasets, including NCLT, Oxford, and KITTI. Our method achieves low scale drift using only pose supervision with BEV representation.
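To make the notions of three-degree-of-freedom motion and scale drift concrete, the following is a minimal sketch and not the authors' implementation. It assumes the three degrees of freedom are planar translation and yaw (x, y, yaw), and it uses hypothetical data: a constant per-frame motion and an arbitrary 5% scale bias. The function names are illustrative only. It chains per-frame relative motions into a trajectory and shows how a constant scale error in the translation estimates carries through to the integrated path.

```python
import math

def compose_se2(pose, delta):
    """Apply a relative motion (dx, dy, dyaw), given in the current body
    frame, to a global planar pose (x, y, yaw)."""
    x, y, yaw = pose
    dx, dy, dyaw = delta
    c, s = math.cos(yaw), math.sin(yaw)
    # Rotate the body-frame translation into the global frame, then add it.
    return (x + c * dx - s * dy, y + s * dx + c * dy, yaw + dyaw)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain relative motions into a list of global poses."""
    poses = [start]
    for delta in deltas:
        poses.append(compose_se2(poses[-1], delta))
    return poses

def path_length(poses):
    """Sum of straight-line distances between consecutive poses."""
    return sum(
        math.hypot(b[0] - a[0], b[1] - a[1])
        for a, b in zip(poses, poses[1:])
    )

# Hypothetical motion: 100 frames, 1 m forward and a slight left turn each.
true_deltas = [(1.0, 0.0, 0.01)] * 100

# Hypothetical estimator whose translations are consistently 5% too long.
biased_deltas = [(1.05 * dx, 1.05 * dy, dyaw) for dx, dy, dyaw in true_deltas]

true_traj = integrate(true_deltas)
biased_traj = integrate(biased_deltas)

# Ratio of estimated to true path length; values away from 1 indicate scale error.
scale_ratio = path_length(biased_traj) / path_length(true_traj)

# Position error at the final frame, which grows as more frames are integrated.
end_error = math.hypot(
    biased_traj[-1][0] - true_traj[-1][0],
    biased_traj[-1][1] - true_traj[-1][1],
)
print(f"scale ratio: {scale_ratio:.3f}, end-point error: {end_error:.2f} m")
```

In this toy setting the scale bias is constant, so the estimated trajectory is a uniformly stretched copy of the true one; in practice the scale error of a monocular system can vary over time, which is what makes drift over long sequences hard to correct.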
Nov-15-2024