A Expected quantization error computation

The expected quantization error is a sum of two terms, the rounding error E

Neural Information Processing Systems 

As we noted in section 6, quantized tensors naturally have some sparsity. As we can see, the sparsity values become very significant, especially for low bit-widths. The results are given in figure 7.

We pruned and quantized single layers of ResNet-18 and plotted

We quantized and pruned all the weight tensors of the PyTorch model zoo. To reduce the computational complexity of finding the global solution for pruning, the layers had to be split into chunks.

In this section, we provide details on the per-layer experiments we performed in section 4. In table 3 we

In table 4 we provide details of the full-model experiments.
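The observation that quantized tensors are naturally sparse can be illustrated with a minimal sketch. It is not the experimental setup behind figure 7: the weights here are synthetic Gaussian samples rather than model zoo tensors, and the symmetric min-max quantizer, the helper names `quantize_symmetric` and `sparsity`, and the chosen bit-widths are all assumptions made for illustration. The sketch only shows the qualitative trend: the fraction of weights rounded to exactly zero grows as the bit-width decreases.

```python
import random


def quantize_symmetric(values, bits):
    """Symmetric uniform quantization to signed integers (assumed scheme).

    The scale is chosen so that the largest magnitude maps to the largest
    representable integer, 2**(bits - 1) - 1.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer grid point and clip to [-qmax, qmax].
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def sparsity(int_values):
    """Fraction of quantized values that are exactly zero."""
    return sum(1 for q in int_values if q == 0) / len(int_values)


# Synthetic stand-in for a weight tensor (assumption: zero-mean Gaussian).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(20000)]

sparsity_by_bits = {
    bits: sparsity(quantize_symmetric(weights, bits)) for bits in (2, 3, 4, 8)
}

for bits, frac in sparsity_by_bits.items():
    print(f"{bits}-bit: fraction of zeros = {frac:.3f}")
```

Under this min-max scaling, a coarser grid widens the interval around zero that rounds to the zero level, which is why the low bit-width settings show the highest sparsity in the sketch.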