Prototypical Variational Autoencoder for 3D Few-shot Object Detection
Neural Information Processing Systems
Few-Shot 3D Point Cloud Object Detection (FS3D) is a challenging task that aims to detect 3D objects of novel classes using only a limited number of annotated training samples. Because detection performance relies heavily on the quality of the latent features, we design a VAE-based prototype learning scheme, named prototypical VAE (P-VAE), that learns a probabilistic latent space to enhance the diversity and distinctiveness of the sampled features. For regularization, P-VAE incorporates a reconstruction task to preserve geometric information. To adapt P-VAE to the detection framework, we formulate a Geometric-informative Prototypical VAE (GP-VAE) to handle varying geometric components and a Class-specific Prototypical VAE (CP-VAE) to handle varying object categories. In the first stage, we harness GP-VAE to aid feature extraction from the input scene.
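The abstract does not spell out P-VAE's formulation, so the following is only a minimal sketch of the generic VAE ingredients it refers to: sampling features from a probabilistic latent space, plus a reconstruction term used as a regularizer. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction loss. The function names and the `beta` weight are hypothetical and not taken from the paper, and the prototype, geometric (GP-VAE), and class-specific (CP-VAE) components are not modeled here.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Squared-error reconstruction term plus a beta-weighted KL regularizer.

    `beta` is an assumed weighting knob, not a parameter from the paper.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * kl_to_standard_normal(mu, log_var)


# Toy usage: a 2-D latent code and a perfectly reconstructed 2-D input.
rng = random.Random(0)
mu, log_var = [0.0, 0.5], [0.0, -1.0]
z = sample_latent(mu, log_var, rng)
loss = vae_loss([1.0, 2.0], [1.0, 2.0], mu, log_var)
```

Drawing several `z` values from the same `mu` and `log_var` yields different feature samples, which is the general mechanism by which a probabilistic latent space can provide feature diversity. The reconstruction term penalizes latent codes that lose information about the input.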
Oct-9-2024, 11:54:23 GMT
- Technology:
- Information Technology > Artificial Intelligence
- Vision (0.64)
- Representation & Reasoning (0.42)
- Machine Learning > Neural Networks (0.40)