Reviews: On the Convergence Rate of Training Recurrent Neural Networks

Neural Information Processing Systems 

This paper proves, for the first time, polynomial-time convergence of SGD/GD in over-parametrized RNNs. Given that there are few theoretical results in this space, all reviewers consider this result significant progress.