CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models
