Microscaling Floating Point Formats for Large Language Models
Marco Cococcioni, Dario Pagani, Federico Rossi
arXiv.org Artificial Intelligence
This paper leverages microscaling floating-point formats, a technique that reduces the storage and computational overhead of numerical representations in LLMs. Unlike traditional floating-point representations, which allocate a dedicated scale to each value, microscaling shares a single scale across a block of values, enabling compact one-byte floating-point representations while preserving an extended dynamic range. We explore microscaling in the context of 8-bit floating-point formats to significantly reduce memory footprint and computational cost. We tested several configurations of microscaling floats within the GPT-2 LLM architecture, showing that microscaling data formats can achieve competitive accuracy during training and inference and thereby serve as a resource-efficient alternative for deploying LLMs at scale. The source code is publicly available at: https://github.com/
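To make the shared-scale idea concrete, here is a minimal Python sketch of microscaling quantization, assuming the common MX layout: blocks of 32 values, each element stored as an FP8 (E4M3) number, with one power-of-two (E8M0-style) scale per block. The block size, element format, and all function names here are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of microscaling (MX) quantization.
# Assumptions (not from the paper): 32-value blocks, FP8 E4M3 elements,
# a shared power-of-two scale per block, round-to-nearest, no subnormals.
import numpy as np

BLOCK_SIZE = 32          # values sharing one scale (typical MX block size)
E4M3_MAX = 448.0         # largest finite magnitude in FP8 E4M3
E4M3_MAX_EXP = 8         # exponent of E4M3_MAX: 448 = 1.75 * 2**8

def quantize_fp8_e4m3(x: np.ndarray) -> np.ndarray:
    """Round to the nearest FP8 E4M3 value (simulated in float32)."""
    x = np.clip(x, -E4M3_MAX, E4M3_MAX)
    # frexp gives x = mant * 2**exp with mant in [0.5, 1);
    # keeping 4 fractional bits of mant corresponds to E4M3's 3 mantissa bits.
    mant, exp = np.frexp(x)
    mant = np.round(mant * 16) / 16
    return np.ldexp(mant, exp)

def mx_quantize_block(block: np.ndarray):
    """Quantize one block: one shared power-of-two scale + FP8 elements."""
    amax = np.max(np.abs(block))
    if amax == 0.0:
        return 1.0, np.zeros_like(block)
    # Choose the scale so the block's largest magnitude lands near the
    # top of the E4M3 range, preserving as much precision as possible.
    shared_exp = int(np.floor(np.log2(amax))) - E4M3_MAX_EXP
    scale = 2.0 ** shared_exp                # power of two only (E8M0-style)
    elements = quantize_fp8_e4m3(block / scale)
    return scale, elements

def mx_dequantize(scale: float, elements: np.ndarray) -> np.ndarray:
    """Reconstruct approximate values from the shared scale and elements."""
    return scale * elements

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.normal(scale=0.02, size=BLOCK_SIZE).astype(np.float32)
    scale, elems = mx_quantize_block(block)
    recon = mx_dequantize(scale, elems)
    print("shared scale:", scale)
    print("max abs error:", float(np.max(np.abs(block - recon))))
```

The key design point the sketch illustrates is that only one scale is stored per block of 32 elements, so the per-value cost stays close to one byte while the shared exponent extends the representable dynamic range well beyond what a standalone 8-bit float could cover.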
Oct-3-2025