Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining
