Scalable 3D Captioning with Pretrained Models

Neural Information Processing Systems 

We introduce Cap3D, an automatic approach for generating descriptive text for 3D objects. Cap3D uses pretrained image captioning, image-text alignment, and large language models (LLMs) to consolidate captions from multiple views of a 3D asset, sidestepping the time-consuming and costly process of manual annotation.
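The following is a minimal sketch of how the three kinds of pretrained model named above could fit together. It is not the authors' implementation. The function `caption_3d_object`, its parameters, and the `toy_*` stand-ins are hypothetical names introduced for illustration. Keeping the single best-aligned candidate per view is an assumption, since the abstract does not say how the alignment model is applied.

```python
from typing import Callable, List, Sequence


def caption_3d_object(
    views: Sequence[str],
    caption_view: Callable[[str], List[str]],
    alignment_score: Callable[[str, str], float],
    consolidate: Callable[[List[str]], str],
) -> str:
    """Caption each rendered view, keep one caption per view, then merge.

    All three callables are hypothetical placeholders for pretrained models:
    an image captioner, an image-text alignment scorer, and an LLM.
    """
    if not views:
        raise ValueError("at least one rendered view is required")

    best_per_view: List[str] = []
    for view in views:
        candidates = caption_view(view)
        if not candidates:
            continue  # skip views for which the captioner produced nothing
        # Assumption: keep the candidate that the alignment model scores highest.
        best = max(candidates, key=lambda c: alignment_score(view, c))
        best_per_view.append(best)

    # The LLM stand-in merges the per-view captions into one description.
    return consolidate(best_per_view)


# Toy stand-ins so the sketch runs without any model weights.
def toy_caption_view(view: str) -> List[str]:
    return [f"an object seen from {view}", f"a chair seen from {view}"]


def toy_alignment_score(view: str, caption: str) -> float:
    # Pretend the alignment model prefers captions that mention "chair".
    return 1.0 if "chair" in caption.split() else 0.0


def toy_consolidate(captions: List[str]) -> str:
    return "; ".join(captions)


if __name__ == "__main__":
    print(
        caption_3d_object(
            ["front", "side"],
            toy_caption_view,
            toy_alignment_score,
            toy_consolidate,
        )
    )
```

Passing the models in as callables keeps the sketch independent of any particular captioning, alignment, or LLM library; real models would replace the toy stand-ins.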
