quantization
Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
While ensembling deep neural networks has shown promise in improving generalization performance, scaling current ensemble methods for large models remains challenging. Given that recent progress in deep learning is largely driven by scale, exemplified by the widespread adoption of large-scale neural network architectures, scalability emerges as an increasingly critical issue for machine learning algorithms in the era of large-scale models. In this work, we first showcase the potential of low precision ensembling, where ensemble members are derived from a single model within low precision number systems in a training-free manner. Our empirical analysis demonstrates the effectiveness of the proposed low precision ensembling method compared to existing ensemble approaches.
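The abstract leaves the construction of the ensemble members implicit. One simple way to realize training-free low precision ensembling is to draw several independent stochastic roundings of the same full-precision weights and average the resulting predictions; the sketch below illustrates this on a toy linear model. The grid step, the member count, and the helper names (stochastic_round_to_grid, low_precision_ensemble_logits) are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def stochastic_round_to_grid(w, step):
    """Stochastically round each weight to the nearest lower or upper grid point.

    The probability of rounding up equals the fractional distance, so the
    rounded value is unbiased: E[round(w)] = w."""
    lower = np.floor(w / step) * step
    frac = (w - lower) / step                     # in [0, 1)
    round_up = np.random.rand(*w.shape) < frac
    return lower + round_up * step

def low_precision_ensemble_logits(weights, x, step=2 ** -4, n_members=8):
    """Average the predictions of ensemble members obtained by independently
    stochastically rounding a single full-precision weight matrix."""
    member_logits = [x @ stochastic_round_to_grid(weights, step)
                     for _ in range(n_members)]
    return np.mean(member_logits, axis=0)

# Toy usage: 5 inputs with 16 features, 3 output classes.
np.random.seed(0)
W = np.random.randn(16, 3).astype(np.float32)
X = np.random.randn(5, 16).astype(np.float32)
print(low_precision_ensemble_logits(W, X).shape)  # (5, 3)
```

Because the rounding is unbiased, each member can be viewed as a cheap perturbation of the same trained model, which is what makes the ensemble training-free.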
Clustering with Bregman Divergences: an Asymptotic Analysis
Clustering, in particular k-means clustering, is a central topic in data analysis. Clustering with Bregman divergences is a recently proposed generalization of k-means clustering which has already been widely used in applications. In this paper we analyze theoretical properties of Bregman clustering when the number of clusters k is large. We establish quantization rates and describe the limiting distribution of the centers as k → ∞, extending well-known results for k-means clustering.
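For readers unfamiliar with Bregman clustering, the key property (due to Banerjee et al.) is that the assignment step uses the chosen Bregman divergence while the center update is always the arithmetic mean of the assigned points, exactly as in k-means. A minimal sketch of this hard-clustering loop follows; the two example divergences, function names, and defaults are illustrative choices, not taken from the paper.

```python
import numpy as np

def squared_euclidean(x, c):
    # Bregman divergence generated by phi(x) = ||x||^2 (ordinary k-means).
    return np.sum((x[:, None, :] - c[None, :, :]) ** 2, axis=-1)

def generalized_kl(x, c, eps=1e-12):
    # Bregman divergence generated by phi(x) = sum_i x_i log x_i (nonnegative data).
    x_, c_ = x[:, None, :] + eps, c[None, :, :] + eps
    return np.sum(x_ * np.log(x_ / c_) - x_ + c_, axis=-1)

def bregman_kmeans(X, k, divergence, n_iter=50, seed=0):
    """Bregman hard clustering: assignment uses the chosen divergence, but the
    center update is always the arithmetic mean of the assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        labels = divergence(X, centers).argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers, labels

# Toy usage on nonnegative 2D data with the generalized KL divergence.
X = np.abs(np.random.default_rng(1).normal(size=(200, 2)))
centers, labels = bregman_kmeans(X, k=4, divergence=generalized_kl)
print(centers.shape, labels.shape)  # (4, 2) (200,)
```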
MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
Motion generation from discrete quantization offers many advantages over continuous regression, but at the cost of inevitable approximation errors. Previous methods usually quantize the entire body pose into one code, which not only makes it difficult to encode all joints within one vector but also loses the spatial relationships between different joints. In contrast, in this work we quantize each individual joint into one vector, which i) simplifies the quantization process, as the complexity associated with a single joint is markedly lower than that of the entire pose; ii) maintains a spatial-temporal structure that preserves both the spatial relationships among joints and the temporal movement patterns; iii) yields a 2D token map, which enables the application of various 2D operations widely used on 2D images. Grounded in this 2D motion quantization, we build a spatial-temporal modeling framework in which a 2D joint VQVAE, a temporal-spatial 2D masking technique, and spatial-temporal 2D attention are proposed to take advantage of the spatial-temporal signals among the 2D tokens. Extensive experiments demonstrate that our method significantly outperforms previous methods across different datasets, with a 26.6% decrease in FID on HumanML3D and a 29.9% decrease on KIT-ML.
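As a concrete illustration of the per-joint quantization that produces the 2D token map, the sketch below performs a nearest-codebook lookup independently for every joint of every frame. The tensor shapes, the shared codebook, and the function name are assumptions made for illustration; this is the quantization step only, not the paper's full 2D joint VQVAE.

```python
import numpy as np

def quantize_joints(motion_feats, codebook):
    """Quantize each joint feature independently with a shared codebook.

    motion_feats: (T, J, D) per-frame, per-joint latent features
    codebook:     (K, D)    code vectors (assumed given here)
    returns:      (T, J) 2D token map and its (T, J, D) reconstruction"""
    T, J, D = motion_feats.shape
    flat = motion_feats.reshape(-1, D)                          # (T*J, D)
    # Squared distance from every joint feature to every code vector.
    d2 = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    tokens = d2.argmin(axis=1).reshape(T, J)                    # 2D token map
    recon = codebook[tokens]                                    # (T, J, D)
    return tokens, recon

# Toy usage: 16 frames, 22 joints, 32-dim per-joint codes, 512 codebook entries.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 22, 32))
codes = rng.normal(size=(512, 32))
tokens, recon = quantize_joints(feats, codes)
print(tokens.shape)  # (16, 22)
```

The resulting (T, J) token map is what allows 2D masking and 2D attention, as used for images, to be applied directly to motion tokens.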
4-bit Shampoo for Memory-Efficient Network Training
Second-order optimizers, maintaining a matrix termed a preconditioner, are superior to first-order optimizers in both theory and practice. The states forming the preconditioner and its inverse root restrict the maximum size of models trained by second-order optimizers. To address this, compressing 32-bit optimizer states to lower bitwidths has shown promise in reducing memory usage.
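A minimal sketch of what compressing 32-bit optimizer states to lower bitwidths can look like: blockwise absmax quantization of a preconditioner-like matrix to signed 4-bit integers plus one float32 scale per block. The block size, rounding scheme, and function names are assumptions for illustration, not necessarily the quantizer used in the paper.

```python
import numpy as np

def quantize_state_4bit(state, block=64):
    """Blockwise absmax quantization of a 32-bit optimizer state to 4 bits.

    Each block of `block` consecutive values stores one float32 scale plus
    one signed 4-bit integer in [-7, 7] per value."""
    flat = state.astype(np.float32).ravel()
    pad = (-len(flat)) % block
    padded = np.concatenate([flat, np.zeros(pad, np.float32)]).reshape(-1, block)
    scales = np.abs(padded).max(axis=1, keepdims=True) + 1e-12
    q = np.clip(np.round(padded / scales * 7), -7, 7).astype(np.int8)
    return q, scales

def dequantize_state_4bit(q, scales, shape):
    """Recover an approximate float32 state from the 4-bit codes and scales."""
    flat = (q.astype(np.float32) / 7) * scales
    return flat.ravel()[: int(np.prod(shape))].reshape(shape)

# Toy usage on a 256x256 preconditioner-like matrix.
P = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, s = quantize_state_4bit(P)
P_hat = dequantize_state_4bit(q, s, P.shape)
print(np.abs(P - P_hat).max())  # worst-case per-entry quantization error
```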
Normalization Helps Training of Quantized LSTM
Lu Hou, Jinhua Zhu, James Kwok, Fei Gao, Tao Qin, Tie-Yan Liu
The long short-term memory (LSTM) network, though powerful, is expensive in both memory and computation. To alleviate this problem, one approach is to compress its weights by quantization. However, existing quantization methods usually have inferior performance when used on LSTMs. In this paper, we first show theoretically that training a quantized LSTM is difficult because quantization makes the exploding gradient problem more severe, particularly when the LSTM weight matrices are large. We then show that the popularly used weight/layer/batch normalization schemes can help stabilize the gradient magnitude when training quantized LSTMs. Empirical results show that normalized quantized LSTMs achieve significantly better results than their unnormalized counterparts. Their performance is also comparable with the full-precision LSTM, while being much smaller in size.
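To make the setting concrete, the sketch below runs a single LSTM step with BWN-style binarized weight matrices and layer-normalized pre-activations. Applying one layer normalization jointly over the concatenated gates, the per-matrix scaling factor, and the helper names are simplifying assumptions for illustration rather than the paper's exact formulation.

```python
import numpy as np

def binarize(W):
    """BWN-style binarization: sign of W times a per-matrix scaling factor."""
    alpha = np.abs(W).mean()
    return alpha * np.sign(W)

def layer_norm(x, eps=1e-5):
    mu, var = x.mean(-1, keepdims=True), x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def quantized_lstm_step(x, h, c, Wx, Wh, b):
    """One LSTM step with binarized weights and layer-normalized pre-activations."""
    z = layer_norm(x @ binarize(Wx) + h @ binarize(Wh) + b)
    i, f, o, g = np.split(z, 4, axis=-1)          # input, forget, output, cell gates
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

# Toy usage: input dim 16, hidden dim 32, batch of 1.
rng = np.random.default_rng(0)
d, hdim = 16, 32
x, h, c = rng.normal(size=(1, d)), np.zeros((1, hdim)), np.zeros((1, hdim))
Wx = rng.normal(size=(d, 4 * hdim)) * 0.1
Wh = rng.normal(size=(hdim, 4 * hdim)) * 0.1
b = np.zeros(4 * hdim)
h, c = quantized_lstm_step(x, h, c, Wx, Wh, b)
print(h.shape)  # (1, 32)
```

The normalization keeps the pre-activation scale fixed regardless of how coarse the binarized weights are, which is the mechanism the paper credits for stabilizing gradient magnitudes.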
Author Feedback
Reviewer 1 "using normalization to improve the accuracy loss caused by quantization is not so impressive": Our Finally, Figure(c) shows the corresponding g values. Because of lack of space, results for the ternarized LSTM are not shown. "sequential MNIST task, the batch normalization (shared) method totally failed": In this task, each time step However, this may not be reasonable (e.g., pixels around the edge are typically darker). " quantization with scaling factors": The table above adds BWN/TWN results on sequential MNIST task (Table 4). The weight/layer normalized quantized LSTMs have comparable results as full-precision baselines, but much smaller.
MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization
The tremendous number of parameters makes deep neural networks impractical to deploy in edge-device-based real-world applications due to limited computational power and storage space. Existing studies have made progress on learning quantized deep models to reduce model size and energy consumption, i.e., converting full-precision weights (r's) into discrete values (q's) in a supervised training manner.
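MetaQuant targets the non-differentiability of the mapping from r to q, which prior work sidesteps with the straight-through estimator (STE). The sketch below shows that STE baseline on a toy quantized linear model: the forward pass uses q = Q(r), while the backward pass simply copies the gradient with respect to q onto r. The 1-bit DoReFa-style quantizer and the squared-error objective are illustrative assumptions; MetaQuant's contribution is to replace this identity pass-through with a learned meta quantizer.

```python
import numpy as np

def quantize_1bit(r):
    """DoReFa-style 1-bit weight quantization: q = sign(r) * E[|r|]."""
    return np.sign(r) * np.abs(r).mean()

def ste_update(r, x, y, lr=0.1):
    """One training step of a quantized linear model with the straight-through
    estimator: forward uses q = Q(r), backward treats dq/dr as the identity."""
    q = quantize_1bit(r)
    pred = x @ q
    grad_q = 2 * x.T @ (pred - y) / len(x)   # dL/dq for squared error
    grad_r = grad_q                          # STE: pass the gradient straight through
    return r - lr * grad_r

# Toy usage: fit a binary-weight regressor to a target realizable by sign weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 8))
r_true = rng.normal(size=(8, 1))
y = x @ np.sign(r_true)
r = rng.normal(size=(8, 1))
for _ in range(200):
    r = ste_update(r, x, y)
print(np.mean((x @ quantize_1bit(r) - y) ** 2))  # squared error after STE training
```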
Author Feedback
However, such an implementation showed roughly a 10-20% drop in performance compared to the current design. MetaQuant costs 51.15 seconds to finish ... We will add this training-time analysis in the final version. MetaQuant focuses more on how to improve STE-based quantization training, without any extra losses or training tricks. MetaQuant follows DoReFa in using a symmetric quantization, which leads to efficient inference. Regarding "... there seems to be a chicken-egg problem": the meta quantizer is actually linked to the final loss L of the ... Regarding "... should the loss function of the base network be used for training ...": note that the goal of the base network is to minimize the final prediction loss, while the aim of the meta quantizer is to provide an accurate gradient ∂L/∂Ŵ. Ideally, ... That is why STE is used to approximate the gradients in previous methods.
LoQT: Low-Rank Adapters for Quantized Pretraining
Sebastian Loeschcke, Mads Toftrup
Despite advances using low-rank adapters and quantization, pretraining of large models on consumer hardware has not been possible without model sharding, offloading during training, or per-layer gradient updates. To address these limitations, we propose Low-Rank Adapters for Quantized Training (LoQT), a method for efficiently training quantized models. LoQT uses gradient-based tensor factorization to initialize low-rank trainable weight matrices that are periodically merged into quantized full-rank weight matrices. Our approach is suitable for both pretraining and fine-tuning models. We demonstrate this for language modeling and downstream task adaptation, finding that LoQT enables efficient training of models up to 7B parameters on a 24GB GPU. We also demonstrate the feasibility of training a 13B model using per-layer gradient updates on the same hardware.
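The core mechanics described in the abstract, a frozen quantized base weight plus trainable low-rank factors that are periodically merged and re-quantized, can be sketched as below. The simulated 4-bit quantizer, the factor re-initialization, and the class and parameter names are assumptions made for illustration; LoQT additionally initializes the factors from gradient information, which is omitted here.

```python
import numpy as np

def quantize_4bit(W, block=64):
    """Simulated blockwise 4-bit quantization (returns the dequantized matrix)."""
    flat = W.astype(np.float32).ravel()
    pad = (-len(flat)) % block
    padded = np.concatenate([flat, np.zeros(pad, np.float32)]).reshape(-1, block)
    scale = np.abs(padded).max(axis=1, keepdims=True) + 1e-12
    q = np.clip(np.round(padded / scale * 7), -7, 7)
    return ((q / 7) * scale).ravel()[: W.size].reshape(W.shape)

class LowRankQuantizedLayer:
    """Frozen quantized base weight plus trainable low-rank factors A @ B.

    Every `merge_every` steps the low-rank update is merged into the base
    weight, which is re-quantized, and the factors are re-initialized."""

    def __init__(self, d_in, d_out, rank=8, merge_every=100, seed=0):
        rng = np.random.default_rng(seed)
        self.Wq = quantize_4bit(rng.normal(size=(d_in, d_out)) * 0.02)
        self.rank, self.merge_every, self.step = rank, merge_every, 0
        self._reset_factors(rng)

    def _reset_factors(self, rng):
        self.A = rng.normal(size=(self.Wq.shape[0], self.rank)) * 0.01
        self.B = np.zeros((self.rank, self.Wq.shape[1]))   # adapter starts as a no-op

    def weight(self):
        return self.Wq + self.A @ self.B

    def apply_grads(self, grad_A, grad_B, lr=1e-3):
        self.A -= lr * grad_A
        self.B -= lr * grad_B
        self.step += 1
        if self.step % self.merge_every == 0:
            # Merge the low-rank update into the base weight and re-quantize it.
            self.Wq = quantize_4bit(self.Wq + self.A @ self.B)
            self._reset_factors(np.random.default_rng(self.step))

# Toy forward pass using the merged effective weight.
layer = LowRankQuantizedLayer(d_in=64, d_out=64)
x = np.random.default_rng(1).normal(size=(4, 64))
print((x @ layer.weight()).shape)  # (4, 64)
```

Only the small factors A and B receive optimizer states, which is where the memory savings over full-rank training come from.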
Q-VLM: Post-training Quantization for Large Vision-Language Models
In this paper, we propose a post-training quantization framework for large vision-language models (LVLMs) for efficient multi-modal inference. Conventional quantization methods sequentially search the layer-wise rounding functions by minimizing activation discretization errors, which fails to acquire the optimal quantization strategy because it ignores cross-layer dependency. In contrast, we mine the cross-layer dependency that significantly influences the discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy searching with low search cost. Specifically, we observe a strong correlation between the activation entropy and the cross-layer dependency concerning output discretization errors. Therefore, we employ the entropy as a proxy to partition blocks optimally, aiming to achieve a satisfying trade-off between discretization errors and search cost. Moreover, we optimize the visual encoder to disentangle the cross-layer dependency for fine-grained decomposition of the search space, so that the search cost is further reduced without harming quantization accuracy. Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x for the 13B LLaVA model without performance degradation on diverse multi-modal reasoning tasks.
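As an illustration of using activation entropy as a proxy for block partitioning, the sketch below estimates each layer's activation entropy from a histogram and greedily starts a new block once an entropy budget is exceeded. The histogram estimator, the greedy rule, the budget, and the function names are all assumptions for illustration and are not the paper's exact partitioning criterion.

```python
import numpy as np

def activation_entropy(acts, n_bins=256):
    """Shannon entropy (in bits) of a layer's activation histogram."""
    hist, _ = np.histogram(acts.ravel(), bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def partition_into_blocks(layer_activations, entropy_budget=20.0):
    """Greedily group consecutive layers into blocks, starting a new block once
    the accumulated entropy proxy exceeds the budget, so high-entropy regions
    end up in smaller blocks that are searched at finer granularity."""
    blocks, current, acc = [], [], 0.0
    for idx, acts in enumerate(layer_activations):
        h = activation_entropy(acts)
        if current and acc + h > entropy_budget:
            blocks.append(current)
            current, acc = [], 0.0
        current.append(idx)
        acc += h
    if current:
        blocks.append(current)
    return blocks

# Toy usage: 12 layers whose activation spread grows with depth.
rng = np.random.default_rng(0)
acts = [rng.normal(scale=1 + 0.3 * i, size=4096) for i in range(12)]
print(partition_into_blocks(acts))  # e.g. [[0, 1, 2], [3, 4, 5], ...]
```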