Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Neural Information Processing Systems
Video-to-Audio (V2A) models have recently gained attention for their practical use in generating audio directly from silent videos, particularly in video and film production. However, previous V2A methods achieve limited generation quality in terms of temporal synchronization and audio-visual relevance.
Mar-27-2025, 14:24:05 GMT