VIMI: Grounding Video Generation through Multi-modal Instruction
