CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
