Early-stopped neural networks are consistent

Neural Information Processing Systems 

Rounding out the story and contributions, firstly we present a brief toy univariate model hinting towards the necessity of early stopping: concretely, any univariate predictor satisfying a local interpolation property can not achieve optimal test error for noisy distributions.