CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

Neural Information Processing Systems 

Work done during an internship at ByteDance, San Jose, CA.
