Re-Imagining Multimodal Instruction Tuning: A Representation View
