Powerpropagation: A sparsity inducing weight reparameterisation

Jan-19-2025, 13:09:55 GMT–Neural Information Processing Systems

The training of sparse neural networks is becoming an increasingly important tool for reducing the computational footprint of models at training and evaluation, as well enabling the effective scaling up of models. Whereas much work over the years has been dedicated to specialised pruning techniques, little attention has been paid to the inherent effect of gradient based training on model sparsity. Inthis work, we introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models. Exploiting the behaviour of gradient descent, our method gives rise to weight updates exhibiting a "rich get richer" dynamic, leaving low-magnitude parameters largely unaffected by learning. Models trained in this manner exhibit similar performance, but have a distributionwith markedly higher density at zero, allowing more parameters to be pruned safely.

powerpropagation, sparsity, weight reparameterisation, (1 more...)

Neural Information Processing Systems

Jan-19-2025, 13:09:55 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)