Position-based Scaled Gradient for Model Quantization and Pruning - Appendix

Aug-22-2025, 01:03:05 GMT–Neural Information Processing Systems

In this experiment, we quantize only the weights, not the activations, to compare the performance degradation as the weight bit-width decreases. The mean squared errors (MSE) of the weights across different bit-widths are also reported. In Fig. A1, we display the full-precision weight distributions of the PSGD models and compare them. Four random layers of each model are shown column-wise. The first row displays the model trained with SGD and L2 weight decay. This is also reported in Figure 1 of the original paper.
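The weight-only quantization and per-bit-width MSE measurement described above can be illustrated with a minimal sketch. It assumes a symmetric uniform quantizer scaled by the maximum absolute weight, which may differ from the quantizer used in the paper. The function names `quantize_uniform` and `weight_mse` and the synthetic Gaussian weights are hypothetical and serve only as an illustration.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of `w` to `bits` bits.

    Assumption: max-abs scaling with a signed integer grid; this is an
    illustrative choice, not necessarily the paper's quantizer.
    """
    levels = 2 ** (bits - 1) - 1          # largest integer code, e.g. 7 for 4 bits
    max_abs = np.max(np.abs(w))
    if levels == 0 or max_abs == 0:
        return np.zeros_like(w)
    scale = max_abs / levels              # step size between adjacent grid points
    codes = np.clip(np.round(w / scale), -levels, levels)
    return codes * scale                  # de-quantized (simulated) weights

def weight_mse(w, bits):
    """Mean squared error between full-precision and quantized weights."""
    return float(np.mean((w - quantize_uniform(w, bits)) ** 2))

# Synthetic stand-in for one layer's full-precision weights (hypothetical data).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=10_000)

# Only the weights are quantized; activations are left untouched.
mse_by_bits = {b: weight_mse(weights, b) for b in (8, 6, 4, 2)}
for b, m in mse_by_bits.items():
    print(f"{b}-bit: MSE = {m:.3e}")
```

With this quantizer, the MSE grows as the bit-width shrinks because the grid becomes coarser; a model whose weights already cluster near the quantization grid points would show a smaller error at the same bit-width.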
