V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models
