MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization

Neural Information Processing Systems 

In this paper, we present a simple optimization-based preprocessing technique called Weight Magnitude Reduction (MagR) to improve the performance of post-training quantization. For each linear layer, we adjust the pre-trained floating-point weights by solving an $\ell_\infty$-regularized optimization problem. This process greatly diminishes the maximum magnitude of the weights and smooths out outliers, while preserving the layer's output. The preprocessed weights are clustered more tightly around zero, which facilitates the subsequent quantization process. To implement MagR, we solve the $\ell_\infty$-regularized problem with an efficient proximal gradient descent algorithm.
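The abstract's proximal gradient scheme can be sketched as follows. This is a minimal illustration, not the paper's reference implementation: the objective $\tfrac{1}{2}\|X\tilde{W} - XW\|_F^2 + \alpha\|\tilde{W}\|_\infty$ (applied per column), the hyperparameters `alpha` and `n_iters`, and the function names are all illustrative assumptions. The key ingredient is that the proximal operator of the $\ell_\infty$ norm is, by Moreau decomposition, the identity minus the Euclidean projection onto an $\ell_1$ ball.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of vector v onto the l1 ball of the given radius."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]              # sorted magnitudes, descending
    css = np.cumsum(u)
    # Largest index rho such that u[rho] > (css[rho] - radius) / (rho + 1)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def prox_linf(v, lam):
    """Prox of lam * ||.||_inf via Moreau decomposition:
    prox(v) = v - projection of v onto the l1 ball of radius lam."""
    return v - project_l1_ball(v, lam)

def magr_layer(X, W, alpha, n_iters):
    """Reduce weight magnitudes while approximately preserving X @ W.

    Proximal gradient descent on, per column w of W:
        0.5 * ||X w_tilde - X w||^2 + alpha * ||w_tilde||_inf
    """
    G = X.T @ X
    L = np.linalg.eigvalsh(G)[-1]             # Lipschitz constant of the gradient
    step = 1.0 / L
    target = G @ W                            # gradient is G @ Wt - G @ W
    Wt = W.copy()
    for _ in range(n_iters):
        V = Wt - step * (G @ Wt - target)     # gradient step on the quadratic term
        for j in range(V.shape[1]):           # proximal step, column by column
            Wt[:, j] = prox_linf(V[:, j], step * alpha)
    return Wt
```

Because the $\ell_\infty$ penalty only charges the largest entry of each column, its prox shaves the top magnitudes down while the data-fidelity term keeps the layer output close to the original, matching the behavior the abstract describes.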