Capacity and Trainability in Recurrent Neural Networks