Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
