Do Vision-Language Models Understand Compound Nouns?

Open in new window