Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning

Neural Information Processing Systems

Specifically, we show that different data modalities (e.g.