No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models

Neural Information Processing Systems 

We study cultural and socioeconomic diversity in contrastive vision-language models (VLMs). Using a broad range of benchmark datasets and evaluation metrics, we highlight several important findings.