Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes

Open in new window