The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models
