Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
Matthieu Cord
