Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
