Vivid-ZOO: Multi-View Video Generation with Diffusion Model

Mar-21-2026, 03:13:00 GMT–Neural Information Processing Systems

While diffusion models have shown impressive performance in 2D image/video generation, diffusion-based Text-to-Multi-view-Video (T2MVid) generation remains underexplored. The new challenges posed by T2MVid generation lie in the lack of massive captioned multi-view video datasets and the complexity of modeling such a multi-dimensional distribution. To this end, we propose a novel diffusion-based pipeline that generates high-quality multi-view videos centered around a dynamic 3D object from text.

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.48)