Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
