Do highly over-parameterized neural networks generalize since bad solutions are rare?
