Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
