A Training Objectives

Our model is trained from scratch with the semantic loss $\mathcal{L}_{\mathrm{sem}}$ [3], the direction loss $\mathcal{L}_{\mathrm{dir}}$, $\mathcal{L}$
Neural Information Processing Systems
We compare the model efficiency of our CluB against the BEV-only baseline in terms of latency and FLOPs. The computational overhead of CluB is 1.2 times the baseline's latency and 1.3 times its FLOPs. This cost is affordable because the auxiliary cluster branch is built entirely with fully sparse operations [2]. A detailed comparison is shown in the following table. Following [4], we adopt a UNet-like architecture that learns point-wise feature representations by applying 3D sparse convolutions and 3D sparse deconvolutions to the obtained sparse voxels.
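To make the UNet-like topology concrete, the sketch below shows its structure on a sparse voxel set: stride-2 downsampling (standing in for a 3D sparse convolution), upsampling back to finer coordinates (standing in for a sparse deconvolution), and additive skip connections. This is a hedged, coordinate-only illustration under assumed simplifications: `downsample`, `upsample`, and `unet_forward` are hypothetical names, mean pooling replaces learned kernels, and features are scalars; a real implementation would use learned sparse (de)convolution layers, e.g. from the spconv library.

```python
# Structural sketch of a UNet-like backbone on sparse voxels.
# Assumption: mean pooling stands in for learned sparse-conv kernels,
# and scalar features stand in for channel vectors.
from collections import defaultdict

def downsample(coords_to_feat, stride=2):
    """Stride-2 'sparse conv' stand-in: voxels sharing the same coarse
    cell are mean-pooled; only occupied cells are materialized."""
    pooled = defaultdict(list)
    for (x, y, z), f in coords_to_feat.items():
        pooled[(x // stride, y // stride, z // stride)].append(f)
    return {c: sum(fs) / len(fs) for c, fs in pooled.items()}

def upsample(coarse, fine_coords, stride=2):
    """'Sparse deconv' stand-in: scatter each coarse feature back to the
    occupied fine coordinates it covers."""
    return {(x, y, z): coarse[(x // stride, y // stride, z // stride)]
            for (x, y, z) in fine_coords}

def unet_forward(voxels):
    """Two-level UNet: encode, decode, and fuse via skip connections,
    returning a point-wise (per-voxel) feature for every input voxel."""
    enc1 = downsample(voxels)            # level-1 features
    enc2 = downsample(enc1)              # level-2 (coarsest) features
    dec1 = upsample(enc2, enc1.keys())   # decode back to level 1
    skip1 = {c: enc1[c] + dec1[c] for c in enc1}     # skip connection
    dec0 = upsample(skip1, voxels.keys())
    return {c: voxels[c] + dec0[c] for c in voxels}  # per-voxel output

voxels = {(0, 0, 0): 1.0, (1, 0, 0): 3.0, (4, 4, 4): 2.0}
out = unet_forward(voxels)  # same occupied coordinates as the input
```

Because every stage touches only occupied coordinates, the cost scales with the number of non-empty voxels rather than the dense grid size, which is why the fully sparse cluster branch adds little overhead.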