Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
