Semantic Composition in Visually Grounded Language Models
