Kosmos-2: Grounding Multimodal Large Language Models to the World
