Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
