Neural Network Quantization for Efficient Inference: A Survey