quantization method
HitNet: Hybrid Ternary Recurrent Neural Network
Quantization is a promising technique for reducing the model size, memory footprint, and computational cost of recurrent neural networks (RNNs) on embedded devices with limited resources. Although extreme low-bit quantization has achieved impressive success on convolutional neural networks, it still suffers from severe accuracy degradation on RNNs at the same low-bit precision. In this paper, we first investigate the accuracy degradation of RNN models under different quantization schemes, as well as the distribution of tensor values in the full-precision model. Our observations reveal that, because the distributions of weights and activations differ, different quantization methods suit different parts of the model. Based on these observations, we propose HitNet, a hybrid ternary recurrent neural network that bridges the accuracy gap between the full-precision model and the quantized model. In HitNet, we develop a hybrid quantization method to quantize weights and activations. Moreover, we introduce a sloping factor into the activation functions, motivated by prior work on Boltzmann machines, which further closes the accuracy gap between the full-precision model and the quantized model.
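To make the two ingredients named in the abstract concrete, here is a minimal sketch of what ternary quantization and a sloped activation can look like. It is not HitNet's actual method, which the abstract does not specify: the function names `ternarize` and `sloped_sigmoid`, the parameter `threshold_ratio`, and the rule of thresholding at 0.7 times the mean absolute value (a heuristic borrowed from Ternary Weight Networks) are all assumptions made for illustration. Reading the sloping factor as a multiplier on the sigmoid input is likewise one plausible interpretation, not a confirmed detail.

```python
import math


def ternarize(values, threshold_ratio=0.7):
    """Quantize a list of floats to codes in {-1, 0, +1} plus one shared scale.

    Assumption: threshold = threshold_ratio * mean(|v|), a common ternary
    heuristic. This is NOT claimed to be the rule HitNet uses.
    """
    if not values:
        return [], 0.0
    mean_abs = sum(abs(v) for v in values) / len(values)
    delta = threshold_ratio * mean_abs
    # Values with magnitude at or below delta are zeroed; the rest keep their sign.
    codes = [1 if v > delta else -1 if v < -delta else 0 for v in values]
    # The shared scale is the mean magnitude of the values that survived.
    kept = [abs(v) for v in values if abs(v) > delta]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


def sloped_sigmoid(x, slope=1.0):
    """Sigmoid with a hypothetical sloping factor applied to its input.

    A larger slope pushes outputs toward 0 or 1, which makes the activation
    easier to quantize. Written in a numerically stable form.
    """
    z = slope * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


weights = [0.9, -0.8, 0.05, -0.02, 0.6]
codes, scale = ternarize(weights)
# Dequantized weights are scale * code for each entry.
dequantized = [scale * c for c in codes]
```

In this sketch each weight is stored as a 2-bit code plus one shared floating-point scale per tensor, which is where the reduction in model size comes from.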