Directional Pruning of Deep Neural Networks
Neural Information Processing Systems
Because stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method that searches for a sparse minimizer in or close to that flat region. The proposed method requires neither retraining nor expert knowledge of the sparsity level.
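The idea of a sparse minimizer inside a flat valley can be illustrated with a toy example. The sketch below is not the paper's algorithm; the two-parameter loss, the helper names, and the step sizes are all assumptions chosen for illustration. The loss is zero along the whole line w1 + w2 = 1, so that line plays the role of the flat valley. Plain gradient descent lands on a dense point of the valley, and moving along the flat direction reaches a sparse point with the same loss, whereas simply zeroing the smaller weight leaves the valley.

```python
# Toy illustration only (hypothetical loss and helpers, not the paper's method).
# The loss is zero everywhere on the line w1 + w2 = 1, which acts as a flat valley.

def loss(w1, w2):
    return (w1 + w2 - 1.0) ** 2

def grad(w1, w2):
    g = 2.0 * (w1 + w2 - 1.0)
    return g, g  # both partial derivatives are equal for this loss

def gradient_descent(w1, w2, lr=0.1, steps=200):
    # Plain full-batch gradient descent standing in for SGD.
    for _ in range(steps):
        g1, g2 = grad(w1, w2)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2

# Gradient descent reaches a dense minimizer (both weights nonzero).
dense = gradient_descent(0.3, 0.1)

# Naive pruning: zero the smaller weight and leave the other unchanged.
naive = (dense[0], 0.0)

# Moving along the flat direction (1, -1) until w2 hits zero stays in the valley.
sparse = (dense[0] + dense[1], 0.0)

print("dense :", dense, "loss =", loss(*dense))
print("naive :", naive, "loss =", loss(*naive))
print("sparse:", sparse, "loss =", loss(*sparse))
```

In this toy setting the flat direction is known in closed form. For a deep network it is not, which is the gap a pruning method of the kind described above has to address.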
Mar-20-2025, 01:44:58 GMT