Scaling Laws for Optimal Data Mixtures
