Mojito: Motion Trajectory and Intensity Control for Video Generation
He, Xuehai, Wang, Shuohang, Yang, Jianwei, Wu, Xiaoxia, Wang, Yiping, Wang, Kuan, Zhan, Zheng, Ruwase, Olatunji, Shen, Yelong, Wang, Xin Eric
arXiv.org Artificial Intelligence
Recent advancements in diffusion models have shown great promise in producing high-quality video content. However, efficiently training diffusion models capable of integrating directional guidance and controllable motion intensity remains a challenging and under-explored area. This paper introduces Mojito, a diffusion model that incorporates both motion trajectory and intensity control for text-to-video generation. Specifically, Mojito features a Directional Motion Control module that leverages cross-attention to efficiently direct the generated object's motion without additional training, alongside a Motion Intensity Modulator that uses optical flow maps generated from videos to guide varying levels of motion intensity. Extensive experiments demonstrate that Mojito achieves precise trajectory and intensity control with high computational efficiency. The generated motion patterns closely match the specified directions and intensities and exhibit realistic dynamics that align well with natural motion in real-world scenarios.
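As a rough illustration of how an optical flow map can be reduced to a single motion-intensity signal, the sketch below averages the per-pixel flow magnitude and bins it into a discrete level. This is not the paper's Motion Intensity Modulator; the function names, the `max_score` ceiling, and the number of levels are all hypothetical choices made for the example.

```python
import math

def motion_intensity(flow_frames):
    """Mean optical-flow magnitude over all pixels and frame pairs.

    flow_frames: list of flow maps; each map is a list of rows, and each
    row is a list of (dx, dy) pixel displacements between adjacent frames.
    """
    total, count = 0.0, 0
    for frame in flow_frames:
        for row in frame:
            for dx, dy in row:
                total += math.hypot(dx, dy)  # Euclidean length of the displacement
                count += 1
    return total / count if count else 0.0

def intensity_level(score, max_score=10.0, num_levels=10):
    """Map a raw score to an integer level in [1, num_levels].

    max_score and num_levels are assumed values for illustration only.
    """
    clipped = min(max(score, 0.0), max_score)
    return max(1, math.ceil(clipped / max_score * num_levels))
```

A level of this kind could then serve as a conditioning input alongside the text prompt; how Mojito actually injects the signal is not described in the abstract.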
Dec-12-2024
- Country:
  - North America > United States
    - New York > New York County
      - New York City (0.04)
    - California > Santa Cruz County
      - Santa Cruz (0.04)
  - Europe
    - Sweden > Halland County
      - Halmstad (0.04)
    - Germany > Bavaria
      - Upper Bavaria > Munich (0.04)
- Genre:
  - Research Report (0.64)
- Technology: