 network compression


Paraphrasing Complex Network: Network Compression via Factor Transfer

Neural Information Processing Systems

Many researchers have sought model compression methods that reduce the size of a deep neural network (DNN) with minimal performance degradation, so that DNNs can be used in embedded systems. Among these methods, knowledge transfer trains a student network with the help of a stronger teacher network. In this paper, we propose a novel knowledge transfer method which uses convolutional operations to paraphrase the teacher's knowledge and to translate it for the student. This is done by two convolutional modules, called a paraphraser and a translator. The paraphraser is trained in an unsupervised manner to extract the teacher factors, which are defined as paraphrased information of the teacher network. The translator, located at the student network, extracts the student factors and helps to translate the teacher factors by mimicking them. We observed that our student network trained with the proposed factor transfer method outperforms the ones trained with conventional knowledge transfer methods.
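To make "mimicking" concrete, the sketch below measures how far a student factor is from a teacher factor. The abstract does not state the loss, so everything here is an assumption made for illustration: factors are treated as flat vectors, each is scaled to unit L2 norm, and the mismatch is the L1 distance between the two. The name `factor_transfer_loss` is hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm so only its direction matters."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def factor_transfer_loss(teacher_factor, student_factor):
    """Assumed mimicry loss: L1 distance between normalized factors.

    This is an illustrative choice, not a loss confirmed by the abstract.
    """
    if len(teacher_factor) != len(student_factor):
        raise ValueError("factors must have the same length")
    t = l2_normalize(teacher_factor)
    s = l2_normalize(student_factor)
    return sum(abs(a - b) for a, b in zip(t, s))


# A student factor pointing in the same direction as the teacher's
# gives (numerically) zero loss, whatever its scale.
aligned = factor_transfer_loss([3.0, 4.0], [6.0, 8.0])

# Orthogonal factors give a large loss.
orthogonal = factor_transfer_loss([1.0, 0.0], [0.0, 1.0])
```

In a training loop, a term of this kind would typically be added to the student's usual classification loss; the relative weighting is a further design choice not covered by the abstract.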



Supplementary Material: Attribution Preservation in Network Compression for Reliable Network Interpretation

Neural Information Processing Systems

ImageNet class labels - the class labels are unusable. In the fine-tuning phase, the pruned network is fine-tuned for 10 epochs with batch size 180. We conduct experiments for structured pruning methods on ImageNet. We observe the same tendencies in the results (Table 4). Our method outperforms naive compression in terms of maintaining the attribution maps.


Fig. R1: Loss over effective learning time for the linear problem and different values of initial connectivity strength g and target value ˆz

Neural Information Processing Systems

We thank the reviewers for their positive and constructive feedback on the paper. We also addressed all the other comments, and they will appear in the revised version. We trained an LSTM network on the NLP task of sentiment analysis (Fig. R2). As in our paper, we found that the resulting changes to the network weights are of low rank. Note that one may not observe this behavior for any task and network off the shelf.


Linearity-based neural network compression

Dobler, Silas, Lemmerich, Florian

arXiv.org Machine Learning

In neural network compression, most current methods reduce unnecessary parameters by measuring importance and redundancy. To augment already highly optimized existing solutions, we propose linearity-based compression as a novel way to reduce weights in a neural network. It is based on the intuition that with ReLU-like activation functions, neurons that are almost always activated behave linearly, allowing for merging of subsequent layers. We introduce the theory underlying this compression and evaluate our approach experimentally. Our novel method achieves a lossless compression down to 1/4 of the original model size in the majority of tested models. Applying our method to models that have already been pruned with importance-based methods shows very little interference between the different types of compression, demonstrating that the techniques can be combined successfully. Overall, our work lays the foundation for a new type of compression method that enables smaller and ultimately more efficient neural network models.
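The merging intuition can be checked on a toy example. If every hidden neuron is active (positive pre-activation) on the inputs of interest, ReLU acts as the identity there, so W2(W1 x + b1) + b2 collapses to a single affine map with weight W2 W1 and bias W2 b1 + b2. The sketch below is a minimal illustration under that assumption, not the authors' algorithm; the weights, helper names, and the restriction to non-negative inputs are all made up for the example.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def affine(W, b, x):
    """Apply the affine map x -> W x + b."""
    return [z + c for z, c in zip(matvec(W, x), b)]


def two_layer(W1, b1, W2, b2, x):
    """Original network: affine layer, ReLU, affine layer."""
    hidden = [max(0.0, z) for z in affine(W1, b1, x)]
    return affine(W2, b2, hidden)


def merge_layers(W1, b1, W2, b2):
    """Merge two layers, assuming the ReLU between them is always active."""
    W = matmul(W2, W1)
    b = [z + c for z, c in zip(matvec(W2, b1), b2)]
    return W, b


# Hypothetical weights: 2 inputs, 3 hidden neurons, 1 output.
# W1 is non-negative and b1 is positive, so every hidden neuron is
# active for any non-negative input.
W1 = [[1.0, 2.0], [0.5, 1.0], [2.0, 0.0]]
b1 = [1.0, 1.0, 1.0]
W2 = [[1.0, -1.0, 0.5]]
b2 = [0.25]

W, b = merge_layers(W1, b1, W2, b2)  # 13 parameters shrink to 3

for x in ([0.0, 0.0], [1.0, 2.0], [3.0, 0.5]):
    original = two_layer(W1, b1, W2, b2, x)
    merged = affine(W, b, x)
    print(x, original, merged)
```

The merge is exact only where the activation assumption holds. The abstract speaks of neurons that are "almost always" activated, so a practical method has to deal with the remaining inputs, which this sketch does not attempt.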


MUC-G4: Minimal Unsat Core-Guided Incremental Verification for Deep Neural Network Compression

Li, Jingyang, Li, Guoqiang

arXiv.org Artificial Intelligence

The rapid development of deep learning has led to challenges in deploying neural networks on edge devices, mainly due to their high memory and runtime complexity. Network compression techniques, such as quantization and pruning, aim to reduce this complexity while maintaining accuracy. However, existing incremental verification methods often focus only on quantization and struggle with structural changes. This paper presents MUC-G4 (Minimal Unsat Core-Guided Incremental Verification), a novel framework for incremental verification of compressed deep neural networks. It encodes both the original and compressed networks into SMT formulas, classifies changes, and uses Minimal Unsat Cores (MUCs) from the original network to guide efficient verification for the compressed network. Experimental results show its effectiveness in handling quantization and pruning, with high proof reuse rates and significant speedup in verification time compared to traditional methods. MUC-G4 hence offers a promising solution for ensuring the safety and reliability of compressed neural networks in practical applications.


Reviews: Positive-Unlabeled Compression on the Cloud

Neural Information Processing Systems

The paper targets the application of network compression using a cloud platform. Instead of uploading all the training data onto the platform, the paper suggests uploading a small portion of the data as positive (P) data and using larger datasets already on the platform as unlabeled (U) data. After a PU classifier is trained, it is used to select more P data from the U data. The selected data, together with the original data, are then used in a knowledge distillation framework to compress the original network. The experimental results show that the compressed network's performance is close to that of the original deep neural network trained on all data, on three widely used datasets.


Attribution Preservation in Network Compression for Reliable Network Interpretation

Neural Information Processing Systems

Neural networks embedded in safety-sensitive applications such as self-driving cars and wearable health monitors rely on two important techniques: input attribution for hindsight analysis and network compression to reduce their size for edge computing. In this paper, we show that these seemingly unrelated techniques conflict with each other, as network compression deforms the produced attributions, which could lead to dire consequences for mission-critical applications. This phenomenon arises because conventional network compression methods only preserve the predictions of the network while ignoring the quality of the attributions. To combat the attribution inconsistency problem, we present a framework that can preserve the attributions while compressing a network. By employing the Weighted Collapsed Attribution Matching regularizer, we match the attribution maps of the network being compressed to those of its pre-compression former self.


Reviews: Frequency-Domain Dynamic Pruning for Convolutional Neural Networks

Neural Information Processing Systems

My only major issue has been addressed and the same is true for my minor questions and issues, except for (5), which I do not consider crucial, particularly given that the authors only have one page for their response. Since most of my issues concerned questions that I had or minor details that should be added to the paper, I have raised my confidence of reproducibility to 3. The paper introduces a novel method for parameter-pruning in convolutional neural networks that operates in the frequency domain. The latter is a natural domain to determine parameter-importance for convolutional filters – most filters of a trained neural network are smooth and thus have high energy (i.e. An additional advantage of the method is that pruning is not performed as a single post-training step, but parameters can be pruned and re-introduced during training in a continuous fashion, which has been shown to be beneficial in previous pruning schemes. The method is evaluated on three different image classification tasks (with a separate network architecture each) and outperforms the methods it is compared against.


Shapley Pruning for Neural Network Compression

Adamczewski, Kamil, Li, Yawei, van Gool, Luc

arXiv.org Artificial Intelligence

Neural network pruning is a rich field with a variety of approaches. In this work, we propose to connect existing pruning concepts such as leave-one-out pruning and oracle pruning and develop them into a more general Shapley value-based framework that targets the compression of convolutional neural networks. To allow for practical applications of the Shapley value, this work presents Shapley value approximations and performs a comparative analysis in terms of cost-benefit utility for neural network compression. The proposed ranks are evaluated against a new benchmark, Oracle rank, constructed based on oracle sets. Broad experiments show that the proposed normative ranking and its approximations yield practical results, obtaining state-of-the-art network compression.
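For readers unfamiliar with the Shapley value, the sketch below computes it exactly for a tiny set of "filters" by averaging each filter's marginal contribution over all coalitions of the others. It is a generic textbook computation, not the paper's method or its approximations; the filter names, the `toy_utility` function, and its numbers are invented for illustration. In a real setting the utility would be something like validation accuracy of the network restricted to the retained filters, and exact enumeration would be far too expensive, which is why approximations are needed.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley value of each player for a set-valued utility function."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def leave_one_out(players, utility):
    """Leave-one-out score: utility drop when one player leaves the full set."""
    full = frozenset(players)
    return {p: utility(full) - utility(full - {p}) for p in players}


# Hypothetical utility: filters "a" and "b" contribute on their own and
# have an extra joint effect; filter "c" contributes nothing.
BASE = {"a": 0.5, "b": 0.25, "c": 0.0}


def toy_utility(kept):
    score = sum(BASE[p] for p in kept)
    if {"a", "b"} <= kept:
        score += 0.25  # interaction bonus shared by "a" and "b"
    return score


filters = ["a", "b", "c"]
shapley = shapley_values(filters, toy_utility)
loo = leave_one_out(filters, toy_utility)

# Rank filters from most to least important; the lowest ranked are
# the natural pruning candidates.
ranking = sorted(filters, key=lambda p: shapley[p], reverse=True)
print(shapley, loo, ranking)
```

Leave-one-out looks at a single coalition (the full set), so it credits the whole interaction bonus to both "a" and "b", whereas the Shapley value splits it between them and its scores sum to the total utility gain.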