Learning the Visualness of Text Using Large Vision-Language Models
