Distribution-Aware Tensor Decomposition for Compression of Convolutional Neural Networks
–Neural Information Processing Systems
Neural networks are widely used for image-related tasks but typically demand considerable computing power. Once a network has been trained, however, its memory and compute footprint can be reduced by compression. In this work, we focus on compression through tensorization and low-rank representations. Whereas classical approaches search for a low-rank approximation by minimizing an isotropic norm such as the Frobenius norm in weight space, we use data-informed norms that measure the error in function space. Concretely, we minimize the change in the layer's output distribution, which can be expressed as $\lVert (W - \widetilde{W}) \Sigma^{1/2}\rVert_F$, where $\Sigma^{1/2}$ is the square root of the covariance matrix of the layer's input and $W$ and $\widetilde{W}$ are the original and compressed weights. We propose new alternating least squares algorithms for the two most common tensor decompositions (Tucker-2 and CPD) that directly optimize the new norm. Unlike conventional compression pipelines, which almost always require post-compression fine-tuning, our data-informed approach often achieves competitive accuracy without any fine-tuning. We further show that the same covariance-based norm can be transferred from one dataset to another with only a minor accuracy drop, enabling compression even when the original training dataset is unavailable. Experiments on several CNN architectures (ResNet-18/50 and GoogLeNet) and datasets (ImageNet, FGVC-Aircraft, CIFAR-10, and CIFAR-100) confirm the advantages of the proposed method.
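To make the data-informed norm concrete, here is a minimal NumPy sketch for the plain matrix case. It is not the paper's ALS algorithm for Tucker-2 or CPD. It only illustrates that, for a single weight matrix and an invertible $\Sigma$, the rank-$r$ minimizer of $\lVert (W - \widetilde{W}) \Sigma^{1/2}\rVert_F$ can be obtained by truncating the SVD of $W\Sigma^{1/2}$ and multiplying by $\Sigma^{-1/2}$. All function names, shapes, and the synthetic data are hypothetical, and the sketch assumes $\Sigma$ is estimated as the uncentered second moment of the inputs, which coincides with the covariance only for zero-mean inputs.

```python
import numpy as np

def sqrt_and_inv_sqrt(S, eps=1e-12):
    """Symmetric square root and inverse square root of a PSD matrix."""
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, eps, None)  # guard against tiny negative eigenvalues
    S_half = (vecs * np.sqrt(vals)) @ vecs.T
    S_inv_half = (vecs / np.sqrt(vals)) @ vecs.T
    return S_half, S_inv_half

def weighted_error(W, W_tilde, S_half):
    """Data-informed error ||(W - W_tilde) Sigma^{1/2}||_F."""
    return np.linalg.norm((W - W_tilde) @ S_half, "fro")

def truncated_svd(M, r):
    """Best rank-r approximation of M in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def demo(seed=0, n_out=8, n_in=16, n_samples=4096, rank=3):
    rng = np.random.default_rng(seed)
    # Synthetic correlated inputs (assumption: rows of X are input samples).
    A = rng.standard_normal((n_in, n_in)) * np.linspace(0.1, 3.0, n_in)
    X = rng.standard_normal((n_samples, n_in)) @ A.T
    W = rng.standard_normal((n_out, n_in))

    # Uncentered second moment, used here in place of the covariance.
    S = X.T @ X / n_samples
    S_half, S_inv_half = sqrt_and_inv_sqrt(S)

    # Baseline: low-rank approximation in weight space (isotropic norm).
    W_frob = truncated_svd(W, rank)
    # Data-informed: truncate W Sigma^{1/2}, then undo the weighting.
    W_data = truncated_svd(W @ S_half, rank) @ S_inv_half

    # Root-mean-square output error measured directly on the samples.
    empirical = np.sqrt(np.mean(np.sum((X @ (W - W_data).T) ** 2, axis=1)))
    return {
        "weighted_frob": weighted_error(W, W_frob, S_half),
        "weighted_data": weighted_error(W, W_data, S_half),
        "empirical_data": empirical,
    }

if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.4f}")
```

In this sketch the weighted norm of the data-informed approximation matches the root-mean-square output error on the samples used to estimate $\Sigma$, and it is never larger than the weighted norm of the Frobenius-optimal baseline at the same rank.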
Jun-13-2026, 00:05:59 GMT