Frame Quantization of Neural Networks

Wojciech Czaja, Sanghoon Na

arXiv.org Machine Learning 

Quantization is the process of compressing input from a continuous or otherwise large set of values into a small discrete set. It gained popularity in signal processing, where one of its primary goals is to obtain a condensed representation of an analogue signal suitable for digital storage and recovery. Examples of quantization algorithms include truncated binary expansion, pulse-code modulation (PCM), and sigma-delta (ΣΔ) quantization. Among them, ΣΔ algorithms stand out due to their theoretically guaranteed robustness. Their mathematical theory was developed in several seminal works [3-5, 8, 11] and has been studied carefully since, e.g., [14, 15, 19, 27]. In recent years, the concept of quantization has also captured the attention of the machine learning community: the quantization of deep neural networks (DNNs) is considered one of the most effective network compression techniques [9]. Computers typically express the parameters of a neural network as 32-bit or 64-bit floating-point numbers.
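The robustness of ΣΔ quantization mentioned above can be illustrated with a minimal first-order sketch (not the construction from the paper; the signal, oversampling rate, and reconstruction filter are illustrative choices). The quantizer feeds its accumulated error back into the next decision, so local averages of the output bits track local averages of the input, unlike memoryless 1-bit rounding:

```python
import numpy as np

def sigma_delta_1bit(x):
    """First-order sigma-delta quantizer: maps samples in [-1, 1] to the
    1-bit alphabet {-1, +1}.  The state u carries the accumulated
    quantization error forward, shaping the noise toward high frequencies."""
    u = 0.0
    q = np.empty_like(x)
    for n, xn in enumerate(x):
        q[n] = 1.0 if u + xn >= 0 else -1.0
        u = u + xn - q[n]          # |u| stays bounded by 1, by induction
    return q

# Oversample a slowly varying test signal.
fs = 128                            # samples per second (illustrative)
t = np.arange(0, 1, 1 / fs)
x = 0.5 * np.sin(2 * np.pi * t)     # one 1 Hz cycle, amplitude 0.5

q_sd = sigma_delta_1bit(x)
q_naive = np.sign(x + 1e-12)        # memoryless 1-bit rounding, for contrast

# Reconstruct both bit streams with the same moving-average low-pass filter.
w = 16
kernel = np.ones(w) / w
rec_sd = np.convolve(q_sd, kernel, mode="same")
rec_naive = np.convolve(q_naive, kernel, mode="same")

interior = slice(w, -w)             # ignore filter boundary effects
err_sd = np.max(np.abs(rec_sd - x)[interior])
err_naive = np.max(np.abs(rec_naive - x)[interior])
```

Here `err_sd` comes out far smaller than `err_naive`: summing the state-update identity `x_n - q_n = u_n - u_{n-1}` over any window telescopes, so the averaged bit error is bounded by 2/w regardless of the signal, which is the kind of guarantee the abstract refers to.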
