MMVA: Multimodal Matching Based on Valence and Arousal across Images, Music, and Musical Captions
