Microscaling Floating Point Formats for Large Language Models

Cococcioni, Marco, Pagani, Dario, Rossi, Federico

Oct-3-2025–arXiv.org Artificial Intelligence

This paper leverages microscaling floating-point formats, a novel technique designed to address these challenges by reducing the storage and computational overhead associated with numerical representations in LLMs. Unlike traditional floating-point representations that allocate a dedicated scale for each value, microscaling employs a shared scale across a block of values, enabling compact one-byte floating-point representations while maintaining an extended dynamic range. We explore the application of microscaling in the context of 8-bit floating-point formats to significantly reduce memory footprint and computational costs. We tested several configurations of microscaling floats within the GPT -2 LLM architecture, demonstrating that microscaling data formats can achieve competitive accuracy during training and inference, proving its efficacy as a resource-efficient alternative for deploying LLMs at scale. The source code is publicly available at: https://github.com/

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

Oct-3-2025

arXiv.org PDF

Add feedback

Country:
- Europe (0.46)

Genre:
- Research Report > Promising Solution (0.66)

Industry:
- Automobiles & Trucks (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.89)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found