A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

Neural Information Processing Systems 

Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers.
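
To make the failure mode concrete, the sketch below contrasts naive recurrent dropout, where a fresh mask is resampled at every time step so the noise compounds along the sequence, with tied-mask (variational) dropout, where one mask is sampled per sequence and reused at every step, which is the approach this paper develops. This is a minimal NumPy sketch under stated assumptions, not code from the paper; the cell, the function names (rnn_step, run_naive_dropout, run_tied_mask_dropout), and the toy weights are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(h, x, W_h, W_x, b):
    # Simple tanh RNN cell: h_t = tanh(W_h h_{t-1} + W_x x_t + b).
    return np.tanh(W_h @ h + W_x @ x + b)

def run_naive_dropout(xs, W_h, W_x, b, p=0.5):
    # Naive recurrent dropout: a fresh mask is sampled at every time
    # step, so the noise compounds across the sequence. This is the
    # variant the abstract refers to as failing on recurrent layers.
    h = np.zeros(W_h.shape[0])
    for x in xs:
        mask = rng.binomial(1, 1 - p, size=h.shape) / (1 - p)
        h = rnn_step(h * mask, x, W_h, W_x, b)
    return h

def run_tied_mask_dropout(xs, W_h, W_x, b, p=0.5):
    # Tied-mask (variational) dropout: one mask is sampled per
    # sequence and reused at every time step, the scheme the paper's
    # Bayesian interpretation of dropout leads to.
    h = np.zeros(W_h.shape[0])
    mask = rng.binomial(1, 1 - p, size=h.shape) / (1 - p)
    for x in xs:
        h = rnn_step(h * mask, x, W_h, W_x, b)
    return h

# Toy usage with random weights and a length-20 sequence
# (hypothetical dimensions, chosen only for illustration).
d_h, d_x, T = 8, 4, 20
W_h = rng.normal(scale=0.5, size=(d_h, d_h))
W_x = rng.normal(scale=0.5, size=(d_h, d_x))
b = np.zeros(d_h)
xs = rng.normal(size=(T, d_x))
print(run_naive_dropout(xs, W_h, W_x, b))
print(run_tied_mask_dropout(xs, W_h, W_x, b))
```

The only difference between the two functions is where the mask is sampled: inside the loop (per step) or outside it (per sequence).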