Towards Meta-Pruning via Optimal Transport
Alexander Theus, Olin Geimer, Friedrich Wicke, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh
Structural pruning of neural networks conventionally relies on identifying and discarding less important neurons, a practice that often results in significant accuracy loss and necessitates subsequent fine-tuning. This paper introduces a novel approach named Intra-Fusion, challenging this prevailing pruning paradigm. Unlike existing methods that focus on designing meaningful neuron importance metrics, Intra-Fusion redefines the overarching pruning procedure. By utilizing the concepts of model fusion and Optimal Transport, we leverage an agnostically given importance metric to arrive at a more effective sparse model representation. Notably, our approach achieves substantial accuracy recovery without the need for resource-intensive fine-tuning, making it an efficient and promising tool for neural network compression. Additionally, we explore how fusion can be added to the pruning process to significantly decrease the training time while maintaining competitive performance. We benchmark our results for various networks on commonly used datasets such as CIFAR-10, CIFAR-100, and ImageNet. More broadly, we hope that the proposed Intra-Fusion approach invigorates exploration into a fresh alternative to the predominant compression approaches.

Alongside their massive progress in the past few years, modern over-parameterized neural networks have also brought something else to the table: their massive size. Consequently, as part of the community keeps training bigger networks, another community has been working, often in the background, to ensure that these bulky networks can be made compact enough to actually be deployed (Hassibi et al., 1993). However, despite the apparent conceptual simplicity of these techniques, compressing neural networks in practice is not as straightforward as applying one or two traditional post-processing steps (Blalock et al., 2020). The process hinges on a crucial element: fine-tuning or retraining, on the original dataset or a subset of it, extending over several additional epochs.
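The abstract does not spell out the mechanics, but the core idea of fusing rather than discarding neurons via Optimal Transport can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the `intra_fuse_layer` helper, the squared-Euclidean cost, the uniform target marginal, and the use of the POT library's exact solver are all illustrative assumptions.

```python
# Minimal sketch: fuse the output neurons of one linear layer via Optimal
# Transport instead of simply discarding the low-importance ones.
import numpy as np
import ot  # POT (Python Optimal Transport): pip install pot

def intra_fuse_layer(W, importance, keep):
    """Compress an (n_out, n_in) weight matrix down to `keep` output neurons.

    Rather than dropping low-importance neurons, all n_out neurons are
    transported onto the `keep` most important ones, and each kept neuron
    becomes a transport-weighted average of the neurons mapped to it.
    `importance` is any nonnegative per-neuron score (metric-agnostic).
    """
    targets = np.argsort(importance)[-keep:]       # anchor (kept) neurons
    a = importance / importance.sum()              # source marginal from importance
    b = np.full(keep, 1.0 / keep)                  # target marginal (uniform: an assumption)
    M = ot.dist(W, W[targets])                     # (n_out, keep) squared-Euclidean cost
    T = ot.emd(a, b, M)                            # exact optimal transport plan
    # Barycentric projection: weighted mean of the neurons sent to each anchor.
    W_fused = (T.T @ W) / T.sum(axis=0, keepdims=True).T
    return W_fused, targets
```

In a full network, the input dimension of the subsequent layer would have to be reduced consistently (e.g., using the returned `targets` indices or an analogous fusion of its columns); the sketch above only shows the single-layer fusion step.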
arXiv.org Artificial Intelligence
Feb-12-2024