Multi-level and multi-modal feature fusion for accurate 3D object detection in Connected and Automated Vehicles