Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
