Csató, Lehel
Pruning CNN's with linear filter ensembles
Sándor, Csanád, Pável, Szabolcs, Csató, Lehel
Despite the promising results of convolutional neural networks (CNNs), deploying them on resource-limited devices remains a challenge, mainly due to their large memory and computation requirements. To tackle these problems, pruning can be applied to reduce the network size and the number of floating point operations (FLOPs). In contrast to the \emph{filter norm} method -- used in network pruning under the assumption that the smaller the norm, the less important the associated component -- we develop a novel filter importance norm that incorporates the loss caused by eliminating a component from the CNN. To estimate the importance of a set of architectural components, we measure the CNN performance as different components are removed. The result is a collection of filter ensembles -- filter masks -- together with their associated performance values. We rank the filters with a linear and additive model and remove the least important ones, such that the drop in network accuracy is minimal. We evaluate our method on a fully connected network as well as on the ResNet architecture trained on the CIFAR-10 dataset. Using our pruning method, we removed $60\%$ of the parameters and $64\%$ of the FLOPs from the ResNet with an accuracy drop of less than $0.6\%$.
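A minimal sketch of the ranking step described in the abstract, assuming the network has already been evaluated under random binary filter masks; the names, sizes, and synthetic accuracies are illustrative, not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_filters, n_ensembles = 64, 500

# masks[e, i] == 1 means filter i is kept in ensemble e, 0 means removed.
masks = rng.integers(0, 2, size=(n_ensembles, n_filters)).astype(float)
# Measured network accuracy for each masked configuration (synthetic here).
accuracies = masks @ rng.uniform(0.0, 0.01, n_filters) + 0.5

# Linear, additive model: accuracy ~ masks @ w + b, solved by least squares.
X = np.hstack([masks, np.ones((n_ensembles, 1))])
coef, *_ = np.linalg.lstsq(X, accuracies, rcond=None)
importance = coef[:-1]                 # per-filter contribution estimates

# Remove the filters whose estimated contribution to accuracy is smallest.
n_prune = int(0.6 * n_filters)
to_prune = np.argsort(importance)[:n_prune]
print("filters to remove:", to_prune)
```

The linear model makes the ranking cheap: each coefficient approximates how much keeping that filter contributes to accuracy, averaged over the sampled ensembles.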
TAP Gibbs Free Energy, Belief Propagation and Sparsity
Csató, Lehel, Opper, Manfred, Winther, Ole
The adaptive TAP Gibbs free energy for a general densely connected probabilistic model with quadratic interactions and arbitrary single-site constraints is derived. We show how a specific sequential minimization of the free energy leads to a generalization of Minka's expectation propagation. Lastly, we derive a sparse representation version of the sequential algorithm. The usefulness of the approach is demonstrated on classification and density estimation with Gaussian processes and on an independent component analysis problem.
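As a point of reference, the model class named in the abstract can be written as follows; the notation is assumed for illustration, not quoted from the paper:

```latex
% Densely connected model with quadratic interactions J and
% arbitrary single-site factors \rho_i (notation illustrative):
\[
  p(\mathbf{x}) \;=\; \frac{1}{Z}\,
  \exp\!\Big(\tfrac{1}{2}\,\mathbf{x}^{\top} J\,\mathbf{x}\Big)
  \prod_{i=1}^{N} \rho_i(x_i)
\]
% The adaptive TAP approach approximates the free energy -\ln Z;
% minimizing it sequentially, one site at a time, yields EP-style
% updates of the single-site moments.
```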
Sparse Representation for Gaussian Process Models
Csató, Lehel, Opper, Manfred
We develop an approach for a sparse representation of Gaussian Process (GP) models in order to overcome the limitations of GPs on large data sets. The method combines a Bayesian online algorithm with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world data sets indicate the efficiency of the approach.
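A minimal sketch of the sequential subsample ("basis vector") construction, assuming an RBF kernel and a simple projection-residual novelty criterion; the threshold and names are illustrative and this omits the paper's online posterior updates:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential kernel between row vectors of a and b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def fit_sparse_gp(X, y, tol=1e-2, noise=1e-2):
    basis = [0]                                   # start with the first point
    for t in range(1, len(X)):
        B = X[basis]
        Kbb = rbf(B, B) + 1e-8 * np.eye(len(basis))
        kxb = rbf(X[t:t+1], B)[0]
        # Novelty: residual of projecting k(x_t, .) onto the current basis;
        # rbf(x, x) = 1 for this kernel.
        gamma = 1.0 - kxb @ np.linalg.solve(Kbb, kxb)
        if gamma > tol:                           # keep only informative points
            basis.append(t)
    B = X[basis]
    alpha = np.linalg.solve(rbf(B, B) + noise * np.eye(len(basis)), y[basis])
    return basis, lambda Xs: rbf(Xs, B) @ alpha   # prediction uses basis only

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])
basis, predict = fit_sparse_gp(X, y)
print(f"kept {len(basis)} of {len(X)} points")
```

The key property mirrored here is that the prediction depends only on the retained subsample, so the model size is decoupled from the size of the full data set.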
Efficient Approaches to Gaussian Process Classification
Csató, Lehel, Fokoué, Ernest, Opper, Manfred, Schottky, Bernhard, Winther, Ole
We present three simple approximations for the calculation of the posterior mean in Gaussian Process classification. The first two methods are related to mean field ideas known in Statistical Physics. The third approach is based on a Bayesian online approach which was motivated by recent results in the Statistical Mechanics of Neural Networks. We present simulation results showing: 1. that the mean field Bayesian evidence may be used for hyperparameter tuning and 2. that the online approach may achieve a low training error fast.
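A minimal sketch in the style of the Bayesian online approach, assuming an RBF kernel and a probit likelihood, and simplifying by updating only the posterior mean weights with the prior variance; this illustrates the flavor of the method, not the paper's exact update equations:

```python
import numpy as np
from math import erf, sqrt, exp, pi

def Phi(z):                      # standard normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def phi(z):                      # standard normal pdf
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def rbf(a, b, ell=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def online_gp_classify(X, y):
    """One pass over (x_t, y_t) pairs, updating the posterior mean weights."""
    n = len(X)
    K = rbf(X, X)
    alpha = np.zeros(n)          # posterior mean: m(x) = sum_i alpha_i k(x, x_i)
    var = K.diagonal()           # prior variance stands in for the predictive one
    for t in range(n):
        m = K[t] @ alpha                     # predictive mean at x_t
        z = y[t] * m / sqrt(1.0 + var[t])
        # Gradient of the log averaged probit likelihood w.r.t. the mean:
        q = y[t] * phi(z) / (Phi(z) * sqrt(1.0 + var[t]))
        alpha[t] += q                        # absorb the update into alpha_t
    return alpha

rng = np.random.default_rng(2)
X = rng.uniform(-2, 2, size=(100, 2))
y = np.sign(X[:, 0] + X[:, 1])
alpha = online_gp_classify(X, y)
pred = np.sign(rbf(X, X) @ alpha)
print("training accuracy:", (pred == y).mean())
```

Each example is processed once, which is what makes this style of approximation attractive when a low training error is needed quickly.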