Appendix for Integrating Momentum into Recurrent Neural Networks