Prompting Large Vision-Language Models for Compositional Reasoning

Open in new window