MotionV2V: Editing Motion in a Video

Burgert, Ryan, Herrmann, Charles, Cole, Forrester, Ryoo, Michael S, Wadhwa, Neal, Voynov, Andrey, Ruiz, Nataniel

arXiv.org Artificial Intelligence 

While generative video models have achieved remarkable fidelity and consistency, applying these capabilities to video editing remains a complex challenge. Recent research has extensively explored motion controllability as a means to enhance text-to-video generation or image animation; however, we identify precise motion control as a promising, yet under-explored, paradigm for editing existing videos. In this work, we propose modifying video motion by directly editing sparse trajectories extracted from the input. We term the deviation between input and output trajectories a 'motion edit' and demonstrate that this representation, when coupled with a generative backbone, enables many powerful video editing capabilities. To achieve this, we introduce a novel pipeline for generating 'motion counterfactuals' -- video pairs that share identical content but distinct motion -- and fine-tune a motion-conditioned video diffusion architecture on this dataset. Our approach allows for edits that start at any timestamp and propagate naturally. In a 4-way head-to-head user study, our model achieves over 65% preference against prior work.
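To make the notion of a 'motion edit' concrete, the sketch below treats sparse trajectories as an array of shape (frames, points, 2) and computes the edit as the per-frame, per-point difference between edited and input tracks. This is only an illustration under stated assumptions: the array layout, the function names `motion_edit` and `shift_from_frame`, and the constant-offset edit are hypothetical and are not taken from the paper, which does not specify its trajectory format here.

```python
import numpy as np


def motion_edit(input_tracks, target_tracks):
    """Deviation between target and input trajectories.

    Both arrays are assumed to have shape (frames, points, 2), holding
    (x, y) pixel coordinates. This layout is an assumption of the sketch.
    """
    input_tracks = np.asarray(input_tracks, dtype=float)
    target_tracks = np.asarray(target_tracks, dtype=float)
    if input_tracks.shape != target_tracks.shape:
        raise ValueError("input and target tracks must have the same shape")
    return target_tracks - input_tracks


def shift_from_frame(tracks, point_index, start_frame, offset):
    """Hypothetical edit: translate one tracked point by `offset`
    from `start_frame` onward, leaving earlier frames untouched."""
    edited = np.array(tracks, dtype=float, copy=True)
    edited[start_frame:, point_index] += np.asarray(offset, dtype=float)
    return edited


# Toy input: 4 frames, 2 tracked points.
input_tracks = np.zeros((4, 2, 2))
input_tracks[:, 0, 0] = np.arange(4)  # point 0 drifts right by 1 px per frame

# Edit that begins at frame 2 and moves point 0 down by 5 px.
target_tracks = shift_from_frame(
    input_tracks, point_index=0, start_frame=2, offset=(0.0, 5.0)
)
edit = motion_edit(input_tracks, target_tracks)
```

In this toy case the edit is zero before frame 2 and for the untouched point, which mirrors the idea of an edit that starts at an arbitrary timestamp. In the approach described above, such trajectories would condition the fine-tuned video diffusion model rather than be applied to pixels directly.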