Low-bit Quantization of Neural Networks for Efficient Inference
