AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs
