What is Apple's Quant for Neural Networks Quantization - Analytics India Magazine

Mar-17-2021, 21:30:04 GMT–#artificialintelligence

Large neural networks are difficult to use in production environments because they are memory intensive and slow during inference. Many successful deep learning models, such as Transformers, are followed by lite versions that dramatically speed up inference at some cost in accuracy. In this article, let's explore Least Squares Quantization, an algorithm that speeds up large neural networks by quantizing them while reducing the accuracy gap to the non-quantized model. Hadi Pouransari, Zhucheng Tu, and Oncel Tuzel, researchers at Apple, introduced this approach in the paper "Least Squares Binary Quantization of Neural Networks" on 23rd March 2020. Smaller models are generally preferable in practice because they use less memory and run inference faster.
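To make the idea concrete, here is a minimal sketch of the simplest case only: 1-bit quantization of a weight vector x as scale * sign(x), where the scale is chosen to minimize the squared error. For that objective the best signs are the signs of x and the best scale is the mean of |x|. The function names and the sample weights are illustrative assumptions, not code from the paper, which also covers other quantization schemes.

```python
def binary_quantize(x):
    """1-bit least-squares quantization: approximate x by scale * signs."""
    if not x:
        raise ValueError("x must be non-empty")
    # For a positive scale, the error-minimizing signs are the signs of x
    # (zero is mapped to +1 by convention here).
    signs = [1.0 if xi >= 0 else -1.0 for xi in x]
    # With those signs fixed, the scale minimizing sum((x_i - scale*s_i)^2)
    # is the mean absolute value of x.
    scale = sum(abs(xi) for xi in x) / len(x)
    return scale, signs


def squared_error(x, scale, signs):
    """Sum of squared differences between x and its quantized version."""
    return sum((xi - scale * si) ** 2 for xi, si in zip(x, signs))


# Hypothetical weights, chosen only for illustration.
weights = [0.9, -0.4, 0.1, -1.2]
scale, signs = binary_quantize(weights)
error = squared_error(weights, scale, signs)
print(scale, signs, error)
```

Each weight is then stored as a single sign bit plus one shared floating-point scale, which is where the memory saving comes from.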
