Can VLMs Recall Factual Associations From Visual References?
