Kosmos-2: Grounding Multimodal Large Language Models to the World