Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks