Scaling Law with Learning Rate Annealing
