Neural Scaling Laws Rooted in the Data Distribution
