Multimodal-Enhanced Objectness Learner for Corner Case Detection in Autonomous Driving
Xiao, Lixing; Shi, Ruixiao; Tang, Xiaoyang; Zhou, Yi
arXiv.org Artificial Intelligence
Previous work on object detection has achieved high accuracy in closed-set scenarios, but performance in open-world scenarios remains unsatisfactory. One challenging open-world problem is corner case detection in autonomous driving. Existing detectors struggle with these cases because they rely heavily on visual appearance and generalize poorly. In this paper, we propose a solution that reduces the discrepancy between known and unknown classes, and we introduce a multimodal-enhanced objectness notion learner. Leveraging both vision-centric and image-text modalities, our semi-supervised learning framework imparts objectness knowledge to the student model, enabling class-aware detection. Our approach, Multimodal-Enhanced Objectness Learner (MENOL) for Corner Case Detection, significantly improves recall for novel classes at a lower training cost. With just 5100 labeled training images, MENOL achieves 76.6% mAR-corner and 79.8% mAR-agnostic on the CODA-val dataset, outperforming the baseline ORE by 71.3% and 60.6%, respectively. The code will be available at https://github.com/tryhiseyyysum/MENOL.
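The recall figures above are mean average recall (mAR) scores. As a rough illustration only, the following minimal sketch computes a class-agnostic mean average recall for a single image, assuming COCO-style IoU thresholds from 0.50 to 0.95 in steps of 0.05 and greedy one-to-one matching. The function names (`iou`, `recall_at`, `mean_average_recall`) are hypothetical, and the official CODA evaluation protocol may differ in its matching rules and in how it aggregates over images.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at(gt: Sequence[Box], pred: Sequence[Box], thr: float) -> float:
    """Fraction of ground-truth boxes matched by some prediction at IoU >= thr.

    Class labels are ignored (class-agnostic), and each prediction may match
    at most one ground-truth box (greedy matching, an assumption of this sketch).
    """
    if not gt:
        return 0.0
    used = set()
    hits = 0
    for g in gt:
        best_j, best_iou = -1, thr
        for j, p in enumerate(pred):
            if j in used:
                continue
            v = iou(g, p)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            used.add(best_j)
            hits += 1
    return hits / len(gt)


def mean_average_recall(
    gt: Sequence[Box],
    pred: Sequence[Box],
    thresholds: Optional[List[float]] = None,
) -> float:
    """Average of recall over a set of IoU thresholds (assumed 0.50:0.05:0.95)."""
    if thresholds is None:
        thresholds = [0.5 + 0.05 * i for i in range(10)]
    return sum(recall_at(gt, pred, t) for t in thresholds) / len(thresholds)
```

For example, with two ground-truth boxes of which one is matched exactly and the other is missed, recall is 0.5 at every threshold, so the mean average recall is also 0.5.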
Feb-2-2024
- Country:
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Genre:
- Research Report (0.82)
- Industry:
- Automobiles & Trucks (0.72)
- Education (0.69)
- Information Technology > Robotics & Automation (0.72)
- Transportation > Ground > Road (0.72)
- Technology:
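- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.72)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.48)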