Residual Distillation: Towards Portable Deep Neural Networks without Shortcuts
Guilin Li
Neural Information Processing Systems
By transferring both features and gradients between different layers, the shortcut connections explored by ResNets allow us to effectively train very deep neural networks with up to hundreds of layers. However, the additional computational cost induced by those shortcuts is often overlooked.
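To make the role of a shortcut concrete, the following is a minimal sketch rather than the paper's method. It assumes a toy scalar "layer" f(x) = w * x; the names `plain_block`, `residual_block`, `stack`, and `grad_wrt_input` are hypothetical and chosen only for illustration. It compares how the gradient with respect to the input behaves in a deep stack with and without the identity shortcut y = x + f(x).

```python
def plain_block(x, w):
    """Plain layer (assumed toy form): y = f(x) = w * x."""
    return w * x


def residual_block(x, w):
    """Residual layer: y = x + f(x). The leading `x +` is the shortcut."""
    return x + plain_block(x, w)


def stack(block, x, w, depth):
    """Apply the same block `depth` times in sequence."""
    for _ in range(depth):
        x = block(x, w)
    return x


def grad_wrt_input(block, x, w, depth, eps=1e-6):
    """Central finite-difference estimate of d(output) / d(input)."""
    hi = stack(block, x + eps, w, depth)
    lo = stack(block, x - eps, w, depth)
    return (hi - lo) / (2 * eps)


# Hypothetical settings: a small weight and 20 stacked layers.
w, depth = 0.01, 20

# Without shortcuts the gradient is w ** depth, which vanishes.
plain_grad = grad_wrt_input(plain_block, 1.0, w, depth)

# With shortcuts the gradient is (1 + w) ** depth, which stays near 1.
residual_grad = grad_wrt_input(residual_block, 1.0, w, depth)

print(f"plain:    {plain_grad:.3e}")
print(f"residual: {residual_grad:.3f}")
```

In this toy setting the identity path contributes a constant 1 to each layer's derivative, which is why the gradient survives depth. The same `x +` term also means the block's input has to be kept available until f(x) has been computed and added, which is one way a shortcut can add cost at inference time.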