MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation

Wang, Weimin, Liu, Jiawei, Lin, Zhijie, Yan, Jiangqiao, Chen, Shuo, Low, Chetwin, Hoang, Tuyen, Wu, Jie, Liew, Jun Hao, Yan, Hanshu, Zhou, Daquan, Feng, Jiashi

Jan-9-2024–arXiv.org Artificial Intelligence

The growing demand for high-fidelity video generation from textual descriptions has catalyzed significant research in this field. In this work, we introduce MagicVideo-V2, which integrates a text-to-image model, a video motion generator, a reference image embedding module, and a frame interpolation module into an end-to-end video generation pipeline. Benefiting from these architectural designs, MagicVideo-V2 can generate aesthetically pleasing, high-resolution video with remarkable fidelity and smoothness. In a large-scale user evaluation, it demonstrates superior performance over leading text-to-video systems such as Runway, Pika 1.0, Morph, Moon Valley, and the Stable Video Diffusion model.

artificial intelligence, machine learning, magicvideo-v2

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.93)