Deep linear networks for regression are implicitly regularized towards flat minima

Open in new window