Searching for Low-Bit Weights in Quantized Neural Networks
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the optimization difficulty of quantized networks. Compared with full-precision parameters (\emph{i.e.}, 32-bit floating-point numbers), low-bit values are selected from a much smaller set. For example, there are only 16 possibilities in the 4-bit space. Thus, we propose to regard the discrete weights in an arbitrary quantized neural network as searchable variables, and utilize a differentiable method to search for them accurately. In particular, each weight is represented as a probability distribution over the discrete value set. The probabilities are optimized during training, and the values with the highest probability are selected to establish the desired quantized network. Experimental results on benchmarks demonstrate that the proposed method produces quantized neural networks with higher performance than state-of-the-art methods on both image classification and super-resolution tasks.
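The search formulation in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the softmax-with-temperature relaxation, the names `soft_weight`/`hard_weight`, and the uniform 4-bit value set are assumptions.

```python
import numpy as np

def soft_weight(logits, values, temperature=1.0):
    """Relax a discrete weight to an expectation over the value set.

    logits: unnormalized scores, one per discrete value (the searchable variables)
    values: the low-bit value set, e.g. the 16 levels of 4-bit quantization
    """
    z = logits / temperature
    p = np.exp(z - z.max())          # numerically stable softmax
    p = p / p.sum()
    return float(np.dot(p, values))  # differentiable surrogate used during training

def hard_weight(logits, values):
    """After training, select the value with the highest probability."""
    return float(values[int(np.argmax(logits))])

# 4-bit example: 16 uniformly spaced levels in [-1, 1]
values = np.linspace(-1.0, 1.0, 16)
logits = np.zeros(16)
logits[3] = 5.0                      # in training, gradients would shape these scores
```

Lowering the temperature sharpens the distribution, so the soft expectation approaches the hard argmax selection used at the end of the search.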
Author Feedback for NeurIPS paper: Searching for Low-Bit Weights in Quantized Neural Networks
We sincerely thank the four reviewers for their valuable comments. Reviewers #2 and #4 both raised concerns about the ablation study, so we address these first. We have already conducted the ablation study on the temperature in Sec. We will fix the typos in the updated version and proofread the paper to make it more readable.
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations
Panferov, Andrei, Chen, Jiale, Tabesh, Soroush, Castro, Roberto L., Nikdan, Mahdi, Alistarh, Dan
One approach to reducing the massive costs of large language models (LLMs) is the use of quantized or sparse representations for training or deployment. While post-training compression methods are very popular, the question of obtaining even more accurate compressed models by directly training over such representations, i.e., Quantization-Aware Training (QAT), is still open: for example, a recent study (arXiv:2411.04330v2) put the "optimal" bit-width at which models can be trained using QAT, while staying accuracy-competitive with standard FP16/BF16 precision, at 8-bit weights and activations. We advance this state-of-the-art via a new method called QuEST, which is Pareto-competitive with FP16, i.e., it provides better accuracy at lower model size, while training models with weights and activations in 4-bits or less. Moreover, QuEST allows stable training with 1-bit weights and activations. QuEST achieves this by improving two key aspects of QAT methods: (1) accurate and fast quantization of the (continuous) distributions of weights and activations via Hadamard normalization and MSE-optimal fitting; (2) a new trust gradient estimator based on the idea of explicitly minimizing the error between the noisy gradient computed over quantized states and the "true" (but unknown) full-precision gradient. Experiments on Llama-type architectures show that QuEST induces stable scaling laws across the entire range of hardware-supported precisions, and can be extended to sparse representations. We provide GPU kernel support showing that models produced by QuEST can be executed efficiently. Our code is available at https://github.com/IST-DASLab/QuEST.
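The two quantization ingredients named in the abstract can be sketched in a minimal form. This is not the QuEST implementation: the function names, the grid search over clipping scales, and the signed-integer level range are assumptions for illustration.

```python
import numpy as np

def hadamard_transform(x):
    """Normalized fast Walsh-Hadamard transform (length must be a power of 2).

    Rotating by a Hadamard matrix spreads outliers, making the distribution
    of values closer to Gaussian and thus easier to quantize.
    """
    n = x.shape[0]
    y = x.copy()
    h = 1
    while h < n:
        y = y.reshape(-1, h * 2)
        a, b = y[:, :h].copy(), y[:, h:].copy()
        y[:, :h], y[:, h:] = a + b, a - b
        y = y.reshape(n)
        h *= 2
    return y / np.sqrt(n)

def mse_optimal_quantize(x, bits=4, n_grid=64):
    """Symmetric uniform quantization with an MSE-optimal clipping scale,
    found here by a simple grid search over candidate scales."""
    levels = 2 ** (bits - 1) - 1                 # e.g. 7 for signed 4-bit
    max_abs = np.abs(x).max()
    best_scale, best_err = max_abs / levels, np.inf
    for frac in np.linspace(0.1, 1.0, n_grid):
        scale = frac * max_abs / levels
        q = np.clip(np.round(x / scale), -levels - 1, levels) * scale
        err = np.mean((x - q) ** 2)
        if err < best_err:
            best_err, best_scale = err, scale
    return np.clip(np.round(x / best_scale), -levels - 1, levels) * best_scale

# demo: quantize a Hadamard-rotated weight vector
w = np.random.default_rng(0).normal(size=64)
w_q = mse_optimal_quantize(hadamard_transform(w), bits=4)
```

Because the normalized transform is orthonormal (it is its own inverse), the quantized values can be rotated back without changing the quantization error, which is what makes the rotation "free" from an accuracy standpoint.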
Review for NeurIPS paper: Searching for Low-Bit Weights in Quantized Neural Networks
Weaknesses: 1) A similar idea of learning an auxiliary differentiable network has already been introduced in the following paper (IJCAI, 2019). The main difference is that this paper learns multiple bits for each weight, whereas binary weights and representations would undoubtedly be more cost-efficient. More importantly, the authors did not discuss this closely related reference. 2) I am confused by Eq. (1): the values v are discrete numbers, while p is the probability that an element of W takes the i-th discrete value.
Review for NeurIPS paper: Searching for Low-Bit Weights in Quantized Neural Networks
The paper proposes a novel end-to-end gradient-based optimization for searching discrete low-bit weights in quantized networks. After reading the reviews, the rebuttal, and the discussion among reviewers, the paper is clearly recognized as novel and well executed. I would encourage the authors to further improve their work by better clarifying the decay strategy for the temperature in the camera-ready version, and by adding a comparison with SGDR scheduling, as pointed out by one of the reviewers. It would also be nice to mention how the proposed approach relates to "Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization".