MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
