XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation Ziyi Wang Y anbo Wang

Neural Information Processing Systems 

Subsequently, the generated 2D masks are employed to align mask-level 3D representations with the vision-language feature space, thereby augmenting the open vocabulary capability of 3D geometry embeddings.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found