Zero-Shot Multi-Modal Artist-Controlled Retrieval and Exploration of 3D Object Sets

Schlachter, Kristofer; Ahlbrand, Benjamin; Wang, Zhu; Ortenzi, Valerio; Perlin, Ken

Sep-1-2022 – arXiv.org Artificial Intelligence

Creating 3D content by hand generally requires highly specialized skills to design and generate models of objects and other assets. We address this problem through high-quality 3D asset retrieval from multi-modal inputs, including 2D sketches, images, and text. We use CLIP because it provides a bridge to higher-level latent features, and we fuse these features across modalities to address the lack of artistic control that affects common data-driven approaches. Our approach enables multi-modal, conditional, feature-driven retrieval from a 3D asset database by combining the latent embeddings of the inputs. We explore the effects of different combinations of feature embeddings across input types and weighting methods.
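The retrieval idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each input modality has already been embedded into a shared latent space (for example by CLIP), that fusion is a weighted sum of unit-normalized embeddings, and that assets are ranked by cosine similarity. The function names `normalize`, `fuse`, and `retrieve`, the asset names, and the 3-dimensional toy vectors are all hypothetical stand-ins chosen for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length so that dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def fuse(embeddings: Dict[str, Sequence[float]], weights: Dict[str, float]) -> Vector:
    """Assumed fusion rule: weighted sum of unit-normalized embeddings, re-normalized."""
    dim = len(next(iter(embeddings.values())))
    fused = [0.0] * dim
    for name, emb in embeddings.items():
        w = weights.get(name, 0.0)  # modalities without a weight are ignored
        unit = normalize(emb)
        for i in range(dim):
            fused[i] += w * unit[i]
    return normalize(fused)


def retrieve(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank database assets by cosine similarity to the query embedding."""
    q = normalize(query)
    scored = [
        (asset_id, sum(a * b for a, b in zip(q, normalize(emb))))
        for asset_id, emb in database.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-d vectors standing in for real embeddings (hypothetical data).
database = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "lamp": [0.0, 0.0, 1.0],
}
query_embeddings = {
    "sketch": [0.9, 0.1, 0.0],  # sketch input that resembles a chair
    "text": [0.1, 0.9, 0.0],    # text prompt that describes a table
}

# Shifting weight between modalities changes which asset ranks first.
sketch_heavy = retrieve(fuse(query_embeddings, {"sketch": 0.8, "text": 0.2}), database)
text_heavy = retrieve(fuse(query_embeddings, {"sketch": 0.2, "text": 0.8}), database)
```

In this toy setup the weights act as the artist's control: favoring the sketch ranks the chair-like asset first, while favoring the text ranks the table-like asset first. A weighted sum is only one possible weighting method; the abstract does not specify which ones the authors compare.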

multi-modal artist-controlled retrieval and exploration, retrieval, zero-shot multi-modal artist-controlled retrieval

Country:
- Europe > Denmark (0.04)
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - Illinois > Cook County
    - Chicago (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language > Large Language Model (0.46)
