4+3 Phases of Compute-Optimal Neural Scaling Laws
Elliot Paquette
