

A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Neural Information Processing Systems

Recomputation algorithms collectively refer to a family of methods that aim to reduce the memory consumption of backpropagation by selectively discarding the intermediate results of the forward propagation and recomputing the discarded results as needed. In this paper, we propose a novel and efficient recomputation method that can be applied to a wider range of neural nets than previous methods. We use the language of graph theory to formalize the general recomputation problem of minimizing the computational overhead under a fixed memory budget constraint, and provide a dynamic programming solution to the problem. Our method can reduce the peak memory consumption on various benchmark networks by $36\%\sim81\%$, which outperforms the reduction achieved by other methods.
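The core idea the abstract describes — discard intermediates during the forward pass, then recompute them segment by segment during backpropagation — can be illustrated with a minimal sketch. This is a plain segment-wise checkpointing scheme on a chain of scalar layers, not the paper's graph-theoretic dynamic programming algorithm; the layer model (`a_{i+1} = a_i * w_i`) and all function names are illustrative:

```python
# Toy recomputation (checkpointing) sketch on a chain of scalar "layers"
# a_{i+1} = a_i * w_i. The checkpointed variant stores only every k-th
# activation in the forward pass and recomputes the discarded ones per
# segment during the backward pass, trading compute for memory.

def forward_full(x, ws):
    """Forward pass storing every activation (the memory-hungry baseline)."""
    acts = [x]
    for w in ws:
        acts.append(acts[-1] * w)
    return acts

def backward_full(acts, ws):
    """Backprop through the chain using all stored activations."""
    g = 1.0                          # d(output)/d(output)
    grads = [0.0] * len(ws)
    for i in reversed(range(len(ws))):
        grads[i] = g * acts[i]       # dL/dw_i = upstream grad * input activation
        g = g * ws[i]                # propagate grad to the previous activation
    return grads

def backward_checkpointed(x, ws, k):
    """Same gradients, but only O(n/k) activations are kept in memory."""
    # Forward pass: keep only activations at segment boundaries.
    ckpts = {0: x}
    a = x
    for i, w in enumerate(ws):
        a = a * w
        if (i + 1) % k == 0:
            ckpts[i + 1] = a
    # Backward pass: walk the segments in reverse; within each segment,
    # recompute the discarded activations from its checkpoint first.
    g = 1.0
    grads = [0.0] * len(ws)
    n = len(ws)
    for start in reversed(range(0, n, k)):
        end = min(start + k, n)
        seg = [ckpts[start]]         # recompute this segment's activations
        for i in range(start, end):
            seg.append(seg[-1] * ws[i])
        for i in reversed(range(start, end)):
            grads[i] = g * seg[i - start]
            g = g * ws[i]
    return grads
```

With `ws = [2.0, 3.0, 0.5, 4.0]` and `x = 1.0`, `backward_checkpointed(x, ws, 2)` returns the same gradients as the full-storage baseline while storing only the segment-boundary activations. The paper generalizes this kind of trade-off from chains to arbitrary computation graphs.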


Reviews: A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Neural Information Processing Systems

The paper proposes a method that reduces the memory consumption of backprop and, as a result, allows the use of larger batch sizes. The authors state that this is significant e.g. with batch norm, where batch size matters. In my own experience, I believe this is also significant for improved GPU utilization and data-parallel training. The paper is original in that I have not seen a similar treatment (although the actual solution may have been relatively straightforward; asking the right question was an important part of this). The paper is well written and easy to follow.


Reviews: A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Neural Information Processing Systems

This work formalizes the problem of minimizing memory consumption through recomputation when performing a forward-backprop evaluation of a computation graph; it provides an optimal dynamic programming algorithm and an efficient heuristic, and demonstrates strong improvements in memory savings over existing methods. Three expert reviewers initially assessed the paper as 8/8/5, and the authors provided a detailed rebuttal. All reviewers took part in a discussion and the final assessment is 8/8/6, with reviewer consensus that this method is practically useful and the reported gains are strong. Overall this work is a nice contribution.



A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Kusumoto, Mitsuru, Inoue, Takuya, Watanabe, Gentaro, Akiba, Takuya, Koyama, Masanori

Neural Information Processing Systems
