Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data

Neural Information Processing Systems 

Existing point cloud-based open-vocabulary 3D detection models are limited by their high deployment costs. In this work, we propose a novel open-vocabulary monocular 3D object detection framework, dubbed OVM3D-Det, which trains detectors using only RGB images, making it both cost-effective and scalable to publicly available data.
