Reviews: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
–Neural Information Processing Systems
The paper proposes a method for pruning deep networks based on the largest-magnitude entries of the gradient vector. The idea is new compared to previous attempts: although it is somewhat related to Fisher pruning, which is also based on gradient magnitudes, the method here is an SGD variant rather than a post-training evaluation method. The technique does not come with rigorous guarantees, but the reviewers agree that the experiments and accompanying studies are interesting enough to motivate future research on this method.
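To make the "SGD variant" framing concrete, here is a minimal sketch of a momentum-SGD step that keeps only the k largest-magnitude gradient entries. It is based solely on the one-sentence description in this summary and is not the paper's algorithm: the function name `sparse_momentum_step`, the parameter `k`, the use of plain gradient magnitude as the selection criterion, and the choice to apply weight decay to every entry are all assumptions made for illustration.

```python
import numpy as np

def sparse_momentum_step(w, grad, velocity, k,
                         lr=0.1, momentum=0.9, weight_decay=1e-4):
    """One momentum-SGD step that keeps only the k largest-|grad| entries.

    Hypothetical illustration: the selection rule and update are assumptions,
    not taken from the paper.
    """
    flat = np.abs(grad).ravel()
    k = max(0, min(k, flat.size))  # clamp k to a valid range

    # Boolean mask marking the k entries with the largest gradient magnitude.
    mask = np.zeros(flat.size, dtype=bool)
    if k > 0:
        top = np.argpartition(flat, flat.size - k)[flat.size - k:]
        mask[top] = True
    mask = mask.reshape(grad.shape)

    # Unselected entries receive no gradient signal, only weight decay.
    masked_grad = np.where(mask, grad, 0.0) + weight_decay * w

    # Standard heavy-ball momentum update on the masked gradient.
    velocity = momentum * velocity + masked_grad
    w = w - lr * velocity
    return w, velocity, mask


w = np.array([1.0, -2.0, 3.0, -4.0])
grad = np.array([0.1, -5.0, 0.2, 3.0])
velocity = np.zeros_like(w)

new_w, new_v, mask = sparse_momentum_step(w, grad, velocity, k=2, weight_decay=0.0)
print(mask)   # which entries were updated by the gradient
print(new_w)  # weights after one step
```

In this sketch the selection happens inside every training step, which is what separates it from a post-training scoring pass. With a nonzero `weight_decay`, entries that are repeatedly left out would shrink toward zero, which is one way such a scheme could produce prunable weights; whether the paper works this way is not stated in this summary.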
Feb-5-2025, 08:22:28 GMT