Spotlighting Partially Visible Cinematic Language for Video-to-Audio Generation via Self-distillation
