Backprop with Approximate Activations for Memory-efficient Network Training

Ayan Chakrabarti, Benjamin Moseley

Neural Information Processing Systems 

Training convolutional neural network models is memory-intensive, since back-propagation requires storing the activations of all intermediate layers.
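
To make the memory cost concrete, here is a minimal illustrative sketch in PyTorch (an assumption; the paper does not specify a framework here) of the general idea of retaining only an approximate, lower-precision copy of an activation for the backward pass. The class name `ApproxActReLU` and the use of float16 as the approximation are hypothetical choices for illustration, not the authors' actual approximation scheme.

```python
import torch

class ApproxActReLU(torch.autograd.Function):
    """ReLU that retains only a low-precision copy of its output for backward.

    Illustrative sketch: saving the activation as float16 instead of float32
    halves the memory held between the forward and backward passes. The
    paper's method uses its own approximation scheme; this only conveys the idea.
    """

    @staticmethod
    def forward(ctx, x):
        y = x.clamp(min=0.0)
        # Save only an approximate (half-precision) copy for the backward pass.
        ctx.save_for_backward(y.to(torch.float16))
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (y_fp16,) = ctx.saved_tensors
        # ReLU gradient: pass gradients through where the (approximate)
        # activation is positive. Quantization error in y_fp16 is the price
        # paid for the memory savings.
        mask = (y_fp16 > 0).to(grad_out.dtype)
        return grad_out * mask


x = torch.randn(4, 8, requires_grad=True)
loss = ApproxActReLU.apply(x).sum()
loss.backward()
print(x.grad.shape)  # torch.Size([4, 8])
```

Because back-propagation touches every intermediate layer, the saved-activation memory grows linearly with network depth; replacing each exact saved tensor with an approximate one reduces that footprint proportionally.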