MonoNav: MAV Navigation via Monocular Depth Estimation and Reconstruction
Nathaniel Simon, Anirudha Majumdar
arXiv.org Artificial Intelligence
A major challenge in deploying the smallest of Micro Aerial Vehicle (MAV) platforms (< 100 g) is their inability to carry sensors that provide high-resolution metric depth information (e.g., LiDAR or stereo cameras). Current systems rely on end-to-end learning or heuristic approaches that directly map images to control inputs, and struggle to fly fast in unknown environments. In this work, we ask the following question: using only a monocular camera, optical odometry, and offboard computation, can we create metrically accurate maps to leverage the powerful path planning and navigation approaches employed by larger state-of-the-art robotic systems to achieve robust autonomy in unknown environments? We present MonoNav: a fast 3D reconstruction and navigation stack for MAVs that leverages recent advances in depth prediction neural networks to enable metrically accurate 3D scene reconstruction from a stream of monocular images and poses. MonoNav uses off-the-shelf pre-trained monocular depth estimation and fusion techniques to construct a map, then searches over motion primitives to plan a collision-free trajectory to the goal. In extensive hardware experiments, we demonstrate how MonoNav enables the Crazyflie (a 37 g MAV) to navigate fast (0.5 m/s) in cluttered indoor environments. We evaluate MonoNav against a state-of-the-art end-to-end approach, and find that the collision rate in navigation is significantly reduced (by a factor of 4). This increased safety comes at the cost of conservatism in terms of a 22% reduction in goal completion.
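The planning step described in the abstract (searching over motion primitives for a collision-free trajectory to the goal) can be illustrated with a minimal 2D sketch. This is not MonoNav's implementation: the constant-yaw-rate arcs, the obstacle point list standing in for the reconstructed map, and every name and parameter here (`rollout`, `choose_primitive`, `safety_radius`, the yaw-rate set, the 2 s horizon) are assumptions made for illustration. Only the 0.5 m/s speed comes from the abstract.

```python
import math


def rollout(speed, yaw_rate, horizon=2.0, dt=0.1):
    """Integrate a constant-speed, constant-yaw-rate arc in 2D.

    The vehicle starts at the origin facing +x. Returns the visited points.
    """
    x = y = heading = 0.0
    points = []
    for _ in range(int(round(horizon / dt))):
        heading += yaw_rate * dt
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        points.append((x, y))
    return points


def min_clearance(points, obstacles):
    """Smallest distance from any trajectory point to any obstacle point."""
    if not obstacles:
        return math.inf
    return min(math.dist(p, o) for p in points for o in obstacles)


def choose_primitive(obstacles, goal, speed=0.5,
                     yaw_rates=(-0.6, -0.3, 0.0, 0.3, 0.6),
                     safety_radius=0.3):
    """Pick the safe primitive whose endpoint is closest to the goal.

    `obstacles` is a hypothetical stand-in for the reconstructed map:
    a list of (x, y) points in the vehicle frame. Returns the chosen
    yaw rate in rad/s, or None if every primitive violates the
    safety radius (the caller would then stop or hover).
    """
    best, best_cost = None, math.inf
    for yaw_rate in yaw_rates:
        points = rollout(speed, yaw_rate)
        if min_clearance(points, obstacles) < safety_radius:
            continue  # primitive passes too close to an obstacle
        cost = math.dist(points[-1], goal)
        if cost < best_cost:
            best, best_cost = yaw_rate, cost
    return best
```

Returning `None` when no primitive is safe mirrors the trade-off the abstract reports: a planner that refuses unsafe motions collides less but may also fail to reach the goal more often.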
Nov-23-2023
- Country:
  - North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Genre:
  - Research Report (0.82)
- Technology:
  - Information Technology > Artificial Intelligence
    - Robots (1.00)
    - Machine Learning (1.00)
    - Representation & Reasoning > Planning & Scheduling (0.87)
    - Vision > Image Understanding (0.65)