Can SGD Learn Recurrent Neural Networks with Provable Generalization?
