Searching for Low-Bit Weights in Quantized Neural Networks

Oct-9-2024, 20:13:05 GMT–Neural Information Processing Systems

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the optimization difficulty of quantized networks. Compared with full-precision parameters (\emph{i.e.}, 32-bit floating numbers), low-bit values are selected from a much smaller set. For example, there are only 16 possibilities in 4-bit space. Thus, we present to regard the discrete weights in an arbitrary quantized neural network as searchable variables, and utilize a differential method to search them accurately.

low-bit weight, quantized neural network, searching, (2 more...)

Neural Information Processing Systems

Oct-9-2024, 20:13:05 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)