Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective

Can Jin, Tianjin Huang, Yihua Zhang, Mykola Pechenizkiy, Sijia Liu, Shiwei Liu, Tianlong Chen

arXiv.org Artificial Intelligence 

The rapid growth of large-scale deep learning models strains the affordability of hardware platforms and necessitates pruning to reduce computational and memory footprints. The resulting sparse neural networks offer numerous benefits, such as low complexity and preserved generalization. Most prominent pruning strategies are designed from a model-centric perspective, searching for and preserving crucial weights by analyzing network topologies. However, the role of data and its interplay with model-centric pruning has remained relatively unexplored. In this work, we introduce a novel data-model co-design perspective: promoting superior weight sparsity by learning important model topologies and suitable input data in a synergistic manner. Specifically, in our proposed VPNs framework, customized Visual Prompts are mounted to upgrade neural Network sparsification. As a pioneering effort, this paper conducts systematic investigations into the impact of different visual prompts on model pruning and proposes an effective joint optimization approach. Furthermore, we find that subnetworks discovered by VPNs from pre-trained models enjoy better transferability across diverse downstream scenarios. These insights shed light on promising new possibilities for data-model co-design in vision model sparsification. Code is available at https://github.com/UNITES-Lab/VPNs.

Large-scale neural networks, such as vision and language models (Brown et al., 2020; Radford et al., 2019; Touvron et al., 2023; Chiang et al., 2023; Li et al., 2022; Bai et al., 2023), have attracted tremendous attention in the deep learning community, but they place significantly increased demands on computing resources. While they offer remarkable performance, they suffer from prohibitively high training and inference costs, and deploying these gigantic models entails substantial memory and computational overhead.
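As a reading aid, the sketch below illustrates one way such a data-model co-design could be set up in PyTorch: a learnable border-style visual prompt is added to the input while per-weight importance scores select a sparse mask via a straight-through top-k, and both are trained against a single loss. All names here (VisualPrompt, MaskedLinear, keep_ratio) and the specific prompt/mask parameterizations are illustrative assumptions, not the paper's exact method; see the linked repository for the authors' implementation.

```python
# Illustrative sketch only: joint optimization of a visual prompt and a
# pruning mask, in the spirit of VPNs. The border prompt and score-based
# top-k mask are assumptions for exposition, not the paper's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VisualPrompt(nn.Module):
    """Learnable perturbation restricted to a border of the input image."""

    def __init__(self, image_size=224, pad=16):
        super().__init__()
        self.prompt = nn.Parameter(torch.zeros(1, 3, image_size, image_size))
        border = torch.ones(1, 1, image_size, image_size)
        border[:, :, pad:-pad, pad:-pad] = 0  # learn only the border pixels
        self.register_buffer("border", border)

    def forward(self, x):
        return x + self.prompt * self.border


class MaskedLinear(nn.Linear):
    """Linear layer sparsified by learnable importance scores (top-k mask)."""

    def __init__(self, in_features, out_features, keep_ratio=0.2):
        super().__init__(in_features, out_features)
        self.keep_ratio = keep_ratio
        self.scores = nn.Parameter(self.weight.abs().detach().clone())

    def forward(self, x):
        k = int(self.keep_ratio * self.scores.numel())
        # The k-th largest score serves as the pruning threshold.
        thresh = self.scores.flatten().kthvalue(self.scores.numel() - k + 1).values
        hard = (self.scores >= thresh).float()
        # Straight-through estimator: binary mask in the forward pass,
        # identity gradient to the scores in the backward pass.
        mask = hard + self.scores - self.scores.detach()
        return F.linear(x, self.weight * mask, self.bias)


# Joint optimization: prompt pixels and mask scores are trained together,
# while the (notionally pre-trained) weights stay frozen.
prompt = VisualPrompt()
model = nn.Sequential(nn.Flatten(), MaskedLinear(3 * 224 * 224, 10))
model[1].weight.requires_grad_(False)
opt = torch.optim.Adam(list(prompt.parameters()) + [model[1].scores], lr=1e-3)

images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 10, (8,))
for _ in range(3):  # toy training loop on dummy data
    loss = F.cross_entropy(model(prompt(images)), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The particular top-k criterion above is just a stand-in for whatever pruning rule the paper adopts; the point of the sketch is only that prompt and mask gradients flow through one shared objective, which is what the abstract's "joint optimization" refers to.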