Memory-Efficient Backpropagation Through Time
Audrunas Gruslys, Remi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves
Neural Information Processing Systems
We propose a novel approach to reducing the memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance the trade-off between caching intermediate results and recomputing them. The algorithm fits tightly within almost any user-set memory budget while finding an execution policy that minimizes computational cost. Since computational devices have limited memory capacity, maximizing computational performance under a fixed memory budget is a practical use case. We provide asymptotic computational upper bounds for various regimes.
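The caching-versus-recomputation trade-off the abstract describes can be illustrated with a minimal sketch. The code below is not the paper's dynamic-programming policy; it is a simpler fixed-interval checkpointing scheme for a scalar RNN (all names and the tanh recurrence are illustrative assumptions): the forward pass stores only every k-th hidden state, and the backward pass recomputes each segment's intermediate states from its checkpoint before backpropagating through it.

```python
import math

def step(h, x, w):
    # One RNN step: h_{t+1} = tanh(w * h_t + x_{t+1}) (scalar, illustrative)
    return math.tanh(w * h + x)

def full_bptt(xs, w, h0=0.0):
    """Standard BPTT: stores all T+1 hidden states. Returns dL/dw for L = h_T."""
    hs = [h0]
    for x in xs:
        hs.append(step(hs[-1], x, w))
    dh, dw = 1.0, 0.0
    for t in range(len(xs) - 1, -1, -1):
        pre = w * hs[t] + xs[t]
        dpre = dh * (1.0 - math.tanh(pre) ** 2)
        dw += dpre * hs[t]
        dh = dpre * w
    return dw

def checkpointed_bptt(xs, w, k, h0=0.0):
    """BPTT storing only every k-th hidden state; intermediate states are
    recomputed segment by segment during the backward pass, trading extra
    forward computation for O(T/k + k) memory instead of O(T)."""
    ckpts = {0: h0}
    h = h0
    for t, x in enumerate(xs, 1):
        h = step(h, x, w)
        if t % k == 0:
            ckpts[t] = h          # cache only checkpoint states
    dh, dw = 1.0, 0.0
    seg_end = len(xs)
    while seg_end > 0:
        seg_start = (seg_end - 1) // k * k
        # Recompute this segment's hidden states from its checkpoint.
        hs = [ckpts[seg_start]]
        for t in range(seg_start, seg_end):
            hs.append(step(hs[-1], xs[t], w))
        # Backpropagate through the recomputed segment.
        for t in range(seg_end - 1, seg_start - 1, -1):
            pre = w * hs[t - seg_start] + xs[t]
            dpre = dh * (1.0 - math.tanh(pre) ** 2)
            dw += dpre * hs[t - seg_start]
            dh = dpre * w
        seg_end = seg_start
    return dw
```

Both routines produce the same gradient; the checkpointed version simply pays roughly one extra forward pass per segment in exchange for the reduced memory footprint. The paper's contribution is to choose the checkpoint positions optimally via dynamic programming rather than at a fixed interval.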