Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
