net-trim
Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee
We introduce and analyze a new technique for model reduction for deep neural networks. While large networks are theoretically capable of learning arbitrarily complex models, overfitting and model redundancy negatively affect prediction accuracy and model variance. Our Net-Trim algorithm prunes (sparsifies) a trained network layer-wise, removing connections at each layer by solving a convex optimization program. This program seeks a sparse set of weights at each layer that keeps the layer inputs and outputs consistent with the originally trained model. The algorithms and associated analysis are applicable to neural networks operating with the rectified linear unit (ReLU) as the nonlinear activation.
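As an illustration of the per-layer idea, the sketch below uses a simplified surrogate: plain $\ell_1$-regularized least squares on a linear layer, solved by proximal gradient (ISTA). This is an assumed stand-in, not Net-Trim's actual program (which handles the ReLU through split linear constraints), but it shows how a sparse weight set that reproduces a layer's input-output map can be recovered:

```python
import numpy as np

def prune_layer_l1(X, Y, lam=0.5, lr=1e-3, steps=3000):
    """Sparse W minimizing 0.5 * ||X @ W - Y||_F^2 + lam * ||W||_1 via ISTA.

    X: layer inputs (samples x in_features), Y: original layer responses.
    A linear surrogate of Net-Trim's per-layer convex program.
    """
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(steps):
        W = W - lr * (X.T @ (X @ W - Y))                         # gradient step
        W = np.sign(W) * np.maximum(np.abs(W) - lr * lam, 0.0)   # soft-threshold
    return W

# Toy layer whose true mapping uses only 2 of 10 inputs:
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
W_true = np.zeros((10, 3))
W_true[0, 0], W_true[4, 1] = 1.5, -2.0
Y = X @ W_true
W = prune_layer_l1(X, Y)   # recovers a sparse W close to W_true
```

The soft-thresholding step is what drives most weights exactly to zero while the gradient step keeps the layer responses consistent.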
Reviews: Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
Summary: This paper adapts the Optimal Brain Surgeon (OBS) method to a local, layer-wise version, modifying the objective function to target the activations of each layer. As in OBS, it uses an approximation to compute the Hessian inverse in a single pass through the dataset. Compared to prior methods, it completes compression with far fewer retraining iterations. A theoretical bound on the total error, based on the local reconstruction errors, is provided. Pros: - The paper explores a local version of OBS and demonstrates the effectiveness of the proposed method in terms of reduced retraining cost for the pruned network.
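For reference, the classical OBS step the review builds on can be sketched as follows (a minimal illustration; the layer-wise Hessian and its one-pass inverse approximation are only mimicked here by the empirical second-moment matrix X^T X / n):

```python
import numpy as np

def obs_prune_once(w, H_inv):
    """One OBS step: remove the lowest-saliency weight, compensate the rest.

    w     : (d,) weight vector of one unit
    H_inv : (d, d) inverse Hessian of the local quadratic error
    """
    saliency = w**2 / (2.0 * np.diag(H_inv))    # predicted error increase per weight
    q = int(np.argmin(saliency))                # cheapest weight to remove
    w_new = w - (w[q] / H_inv[q, q]) * H_inv[:, q]   # second-order compensation
    w_new[q] = 0.0                              # pruned weight is exactly zero
    return w_new, q

# Layer-wise Hessian estimated in a single pass over the data (plus damping):
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 6))
H_inv = np.linalg.inv(X.T @ X / 500 + 1e-4 * np.eye(6))
w = rng.standard_normal(6)
w_pruned, q = obs_prune_once(w, H_inv)
```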
Reviews: Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee
The paper presents a technique to sparsify a deep ReLU neural network by solving a sequence of convex problems at each layer. The convex problem finds the sparsest set of weights that approximates the mapping from one layer to the next. The ReLU nonlinearity is handled by treating the activated and deactivated cases as two separate sets of constraints in the optimization problem, which keeps the problem convex. Two variants are provided: one that considers each layer separately, and another that carries the approximation error forward into the next layer's optimization problem, giving that layer the chance to counterbalance the error. In both cases, the authors provide bounds on the approximation error after sparsification.
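The constraint-splitting trick described in the review can be made concrete with a small numerical check (a toy illustration, not the paper's solver): samples where the original layer is active pin down the pre-activation, while inactive samples only impose a linear inequality, so both constraint sets are convex in the new weights:

```python
import numpy as np

# One output unit of a trained ReLU layer: inputs X, responses Y = relu(X @ w).
rng = np.random.default_rng(2)
X = rng.standard_normal((50, 4))
w = rng.standard_normal(4)
Y = np.maximum(X @ w, 0.0)

# Split the samples by the ORIGINAL network's activation pattern:
active = Y > 0        # here the pre-activation X @ w is known (it equals Y)
inactive = ~active    # here we only need X @ w_hat <= 0

# Both sets are linear in the new weights w_hat, so the feasible region is
# convex. Any feasible w_hat reproduces the layer's ReLU output exactly:
w_hat = w             # the original weights are trivially feasible
assert np.allclose(X[active] @ w_hat, Y[active])
assert np.all(X[inactive] @ w_hat <= 0)
assert np.allclose(np.maximum(X @ w_hat, 0.0), Y)
```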
Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
Xin Dong, Shangyu Chen, Sinno Pan
How to develop slim and accurate deep neural networks has become crucial for real-world applications, especially for those employed in embedded systems. Though previous work along this research line has shown some promising results, most existing methods either fail to significantly compress a well-trained deep network or require a heavy retraining process for the pruned deep network to re-boost its prediction performance. In this paper, we propose a new layer-wise pruning method for deep neural networks. In our proposed method, parameters of each individual layer are pruned independently based on second order derivatives of a layer-wise error function with respect to the corresponding parameters. We prove that the final prediction performance drop after pruning is bounded by a linear combination of the reconstructed errors caused at each layer. By controlling layer-wise errors properly, one only needs to perform a light retraining process on the pruned network to resume its original prediction performance. We conduct extensive experiments on benchmark datasets to demonstrate the effectiveness of our pruning method compared with several state-of-the-art baseline methods. Codes of our work are released at: https://github.com/csyhhu/L-OBS.
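A hedged sketch of the layer-wise second-order pruning idea follows (this is not the released L-OBS code; the Hessian of a linear layer's squared reconstruction error is taken as X^T X / n, and its inverse is kept fixed across removals, a simplification the actual method refines):

```python
import numpy as np

def lobs_prune_layer(W, X, sparsity=0.5, damp=1e-4):
    """Prune a fraction of a layer's weights by OBS saliency, layer-wise.

    W: (in_features, out_features) weight matrix; X: (n, in_features) inputs.
    Each column of W shares the same layer-wise Hessian H = X^T X / n.
    """
    n = X.shape[0]
    H_inv = np.linalg.inv(X.T @ X / n + damp * np.eye(X.shape[1]))
    diag = np.diag(H_inv)
    W = W.copy()
    mask = np.ones(W.shape, dtype=bool)
    n_prune = int(sparsity * W.size)
    for _ in range(n_prune):
        # saliency = predicted error increase; already-pruned weights excluded
        sal = np.where(mask, W**2 / (2.0 * diag[:, None]), np.inf)
        q, j = np.unravel_index(np.argmin(sal), sal.shape)
        W[:, j] -= (W[q, j] / H_inv[q, q]) * H_inv[:, q]   # compensate column j
        mask[q, j] = False
        W[~mask] = 0.0        # keep previously pruned weights at zero
    return W

rng = np.random.default_rng(3)
X = rng.standard_normal((300, 6))
W0 = rng.standard_normal((6, 3))
W_pruned = lobs_prune_layer(W0, X, sparsity=0.5)   # at least half the weights cut
```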
Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee
Aghasi, Alireza, Abdi, Afshin, Nguyen, Nam, Romberg, Justin
We introduce and analyze a new technique for model reduction for deep neural networks. While large networks are theoretically capable of learning arbitrarily complex models, overfitting and model redundancy negatively affect prediction accuracy and model variance. Our Net-Trim algorithm prunes (sparsifies) a trained network layer-wise, removing connections at each layer by solving a convex optimization program. This program seeks a sparse set of weights at each layer that keeps the layer inputs and outputs consistent with the originally trained model. The algorithms and associated analysis are applicable to neural networks operating with the rectified linear unit (ReLU) as the nonlinear activation.
Fast Convex Pruning of Deep Neural Networks
Aghasi, Alireza, Abdi, Afshin, Romberg, Justin
We develop a fast, tractable technique called Net-Trim for simplifying a trained neural network. The method is a convex post-processing module, which prunes (sparsifies) a trained network layer by layer, while preserving the internal responses. We present a comprehensive analysis of Net-Trim from both the algorithmic and sample complexity standpoints, centered on a fast, scalable convex optimization program. Our analysis includes consistency results between the initial and retrained models before and after Net-Trim application and guarantees on the number of training samples needed to discover a network that can be expressed using a certain number of nonzero terms. Specifically, if there is a set of weights that uses at most $s$ terms that can re-create the layer outputs from the layer inputs, we can find these weights from $\mathcal{O}(s\log(N/s))$ samples, where $N$ is the input size. These theoretical results are similar to those for sparse regression using the Lasso, and our analysis uses some of the same recently developed tools (namely recent results on the concentration of measure and convex analysis). Finally, we propose an algorithmic framework based on the alternating direction method of multipliers (ADMM), which allows a fast and simple implementation of Net-Trim for network pruning and compression.
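The ADMM framework mentioned at the end can be illustrated on the canonical $\ell_1$ problem (a simplified stand-in for Net-Trim's constrained program, using the usual lasso splitting $x = z$):

```python
import numpy as np

def admm_lasso(A, b, lam=0.1, rho=1.0, iters=300):
    """ADMM for min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.

    Split the quadratic part (x) and the l1 term (z) with the constraint
    x = z; u is the scaled dual variable. The system matrix is factored once.
    """
    n = A.shape[1]
    Atb = A.T @ b
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))   # factor once, reuse
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(iters):
        # x-update: solve (A^T A + rho I) x = A^T b + rho (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, Atb + rho * (z - u)))
        # z-update: soft-thresholding, the prox of (lam/rho) * ||.||_1
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)
        u = u + x - z                                    # dual update
    return z

# Toy sparse recovery: 3 active coefficients out of 20
rng = np.random.default_rng(4)
A = rng.standard_normal((100, 20))
x_true = np.zeros(20)
x_true[[2, 7, 15]] = [2.0, -1.5, 3.0]
b = A @ x_true
x_hat = admm_lasso(A, b)
```

Factoring the system matrix once and reusing it in every iteration is what makes this style of solver fast, the same design choice the abstract emphasizes.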
Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
Dong, Xin, Chen, Shangyu, Pan, Sinno
How to develop slim and accurate deep neural networks has become crucial for real-world applications, especially for those employed in embedded systems. Though previous work along this research line has shown some promising results, most existing methods either fail to significantly compress a well-trained deep network or require a heavy retraining process for the pruned deep network to re-boost its prediction performance. In this paper, we propose a new layer-wise pruning method for deep neural networks. In our proposed method, parameters of each individual layer are pruned independently based on second order derivatives of a layer-wise error function with respect to the corresponding parameters. We prove that the final prediction performance drop after pruning is bounded by a linear combination of the reconstructed errors caused at each layer. By controlling layer-wise errors properly, one only needs to perform a light retraining process on the pruned network to resume its original prediction performance. We conduct extensive experiments on benchmark datasets to demonstrate the effectiveness of our pruning method compared with several state-of-the-art baseline methods. Codes of our work are released at: https://github.com/csyhhu/L-OBS.