ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization

Jun-19-2026, 05:12:24 GMT – Neural Information Processing Systems

The optimal bit-width for achieving the best trade-off between quantized model size and accuracy has been a subject of ongoing debate. While some advocate for 4-bit quantization, others propose that 1.58-bit offers superior results. However, the lack of a cohesive framework for comparing different bit-widths has left such conclusions relatively tenuous.

large language model, machine learning, quantization

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.99)
  - Representation & Reasoning > Commonsense Reasoning (0.68)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
