Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

Neural Information Processing Systems 

We study the generalization of two-layer ReLU neural networks in a univariate nonparametric regression problem with noisy labels. This is a problem where kernels (e.g.