Text-To-4D Dynamic Scene Generation

Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman

Jan-26-2023–arXiv.org Artificial Intelligence

We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions. Our approach uses a 4D dynamic Neural Radiance Field (NeRF), which is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model. The dynamic video output generated from the provided text can be viewed from any camera location and angle, and can be composited into any 3D environment.

Generative models have seen tremendous recent progress, and can now generate realistic images from natural language prompts (Ramesh et al., 2022; Gafni et al., 2022; Rombach et al., 2022; Saharia et al., 2022; Yu et al., 2022; Sheynin et al., 2022). This success has been extended beyond 2D images both temporally to synthesize videos (Singer et al., 2022; Ho et al., 2022) and spatially to produce 3D shapes (Poole et al., 2022; Lin et al., 2022; Nichol et al., 2022b). However, these two categories of generative models...

artificial intelligence, machine learning, natural language

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Natural Language (1.00)
  - Machine Learning (1.00)
