Supplementary Materials for "Every View Counts: Cross-View Consistency in 3D Object Detection with Hybrid-Cylindrical-Spherical Voxelization"
In this document, we provide more details about the implementation and about experiments with different voxelization methods. For z, we use 12 bins over the range [-5, 3]. For log l, log w, and log h, we use 12 bins over the range [-3, 3]. We use a code weight of 0.5 for velocity prediction and 1.0 for the other bounding box statistics. In the classification head, we set alpha = 0.25 and gamma = 2.0 for Focal Loss [1].
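As a rough illustration of the settings above, the following sketch shows one way a regression target could be discretized into 12 uniform bins and how a binary focal loss with alpha = 0.25 and gamma = 2.0 is computed for a single prediction. It is not the authors' implementation: the helper names (value_to_bin, bin_center, focal_loss), the uniform bin spacing, the clamping of out-of-range values, and the reading of the ranges as [-5, 3] and [-3, 3] are all assumptions made for this example.

```python
import math


def value_to_bin(value, lo, hi, num_bins):
    """Map a scalar to a bin index in [0, num_bins - 1].

    Assumes uniform bins over [lo, hi]; out-of-range values are clamped.
    """
    width = (hi - lo) / num_bins
    idx = int(math.floor((value - lo) / width))
    return min(max(idx, 0), num_bins - 1)


def bin_center(idx, lo, hi, num_bins):
    """Return the center of bin `idx` (one possible way to decode a bin)."""
    width = (hi - lo) / num_bins
    return lo + (idx + 0.5) * width


def focal_loss(p, target, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one predicted probability p and a label in {0, 1}."""
    p_t = p if target == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


# Hypothetical targets: a box center height z and a log-length log l.
z_bin = value_to_bin(-1.2, lo=-5.0, hi=3.0, num_bins=12)
log_l_bin = value_to_bin(math.log(4.5), lo=-3.0, hi=3.0, num_bins=12)
```

With gamma = 2.0, a confidently correct prediction (p close to 1 for a positive) contributes far less to the loss than a hard one, which is the down-weighting effect that Focal Loss is designed to provide.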
Author Feedback
We thank all the reviewers for their efforts and thoughtful feedback. Moreover, we provided in-depth analysis and clear ablation studies to validate our contributions. Our algorithm runs at 8 FPS on a single V100 GPU on the Waymo Open Dataset. The experiments are conducted on the NuScenes validation set. MB of parameters (18% more) in our proposed CVCNet.