Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models