Reviews: The Reversible Residual Network: Backpropagation Without Storing Activations

Neural Information Processing Systems 

The authors introduce "RevNets", which avoid storing (some) activations by utilizing computational blocks that are trivial to invert, so that activations can be reconstructed during the backward pass rather than stored. RevNets match the performance of ResNets with the same number of parameters, and in practice appear to save 4X in storage at the cost of a 2X increase in computation. Interestingly, the reversible blocks are also volume preserving, which is not explicitly discussed but should be, because this is a potential limitation. Since the approach of reconstructing activations rather than storing them is only applicable to invertible layers, it requires only O(1) storage for those layers, yet achieves only a 4X overall gain in storage requirements (which is nevertheless impressive).

One concern I have is that the recent work on decoupled neural interfaces (DNI) is not adequately discussed or compared to: DNI also requires O(1) storage, and estimates error signals (and optionally input values) analogously to how value functions are learned in reinforcement learning.
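To make the "trivial to invert" property concrete, here is a minimal sketch of the additive coupling scheme the paper builds on, where the input is split into two halves and each half is updated by a residual function of the other. The functions `F` and `G` below are arbitrary stand-ins for the paper's residual subnetworks; the split sizes and function choices are assumptions for illustration. Because each step is purely additive, the block can be inverted exactly by subtraction (and, since the Jacobian of each additive step is unit-triangular, the mapping is volume preserving).

```python
import numpy as np

# Stand-in residual functions (assumptions; the paper uses learned
# convolutional subnetworks here).
def F(x):
    return np.tanh(x)

def G(x):
    return 0.5 * np.sin(x)

def forward(x1, x2):
    # Additive coupling: y1 = x1 + F(x2), y2 = x2 + G(y1)
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def inverse(y1, y2):
    # Exact inversion by subtraction: inputs are reconstructed from
    # outputs, so activations need not be stored for backprop.
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(4), rng.standard_normal(4)
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(np.allclose(r1, x1) and np.allclose(r2, x2))
```

During the backward pass, each block's inputs can thus be recomputed on the fly from its outputs, which is the source of the memory savings (and of the extra computation).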