Theory of Optimal Learning Rate Schedules and Scaling Laws for a Random Feature Model
