The Quantization Model of Neural Scaling

Eric J. Michaud

May-23-2025, 21:22:05 GMT–Neural Information Processing Systems

We propose the Quantization Model of neural scaling laws, which explains both the observed power-law dropoff of loss with model and data size and the sudden emergence of new capabilities with scale. We derive this model from what we call the Quantization Hypothesis, under which network knowledge and skills are "quantized" into discrete chunks (quanta). We show that when quanta are learned in order of decreasing use frequency, a power law in use frequencies explains the observed power-law scaling of loss.
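The last claim of the abstract can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not stated in the text above: quanta use frequencies are taken to follow p_k proportional to k^-(alpha + 1), a sample incurs loss 1 if the quantum it needs is unlearned and 0 otherwise, and the function names are hypothetical. Under those assumptions, learning the n most frequent quanta should leave an expected loss that falls off roughly as n^-alpha.

```python
import numpy as np

def zipf_frequencies(num_quanta, alpha):
    """Normalized use frequencies p_k proportional to k^-(alpha + 1) (assumed form)."""
    k = np.arange(1, num_quanta + 1, dtype=float)
    weights = k ** -(alpha + 1.0)
    return weights / weights.sum()

def expected_loss(freqs, n_learned):
    """Expected loss when the n_learned most frequent quanta are learned.

    Toy assumption: a sample costs 1 if its quantum is unlearned, 0 otherwise,
    so the loss is the total frequency mass of the unlearned quanta.
    """
    return freqs[n_learned:].sum()

def fitted_exponent(alpha, num_quanta=1_000_000,
                    n_values=(100, 200, 400, 800, 1600)):
    """Fit loss ~ n^-exponent on a log-log scale and return the exponent."""
    freqs = zipf_frequencies(num_quanta, alpha)
    losses = [expected_loss(freqs, n) for n in n_values]
    slope, _intercept = np.polyfit(np.log(n_values), np.log(losses), 1)
    return -slope

if __name__ == "__main__":
    # The fitted exponent should come out near alpha, up to finite-size effects.
    print(fitted_exponent(alpha=0.5))
```

The fit is only approximate because the frequency distribution is truncated at a finite number of quanta; choosing n_values much smaller than num_quanta keeps that bias small.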

machine learning, natural language, quanta

Country:
- Europe > United Kingdom > England > Greater London > London (0.14)

Genre:
- Research Report (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks (0.94)
    - Statistical Learning (1.00)
  - Natural Language (1.00)
