Large Language Models are Visual Reasoning Coordinators
