With Shared Microexponents, A Little Shifting Goes a Long Way

Rouhani, Bita, Zhao, Ritchie, Elango, Venmugil, Shafipour, Rasoul, Hall, Mathew, Mesmakhosroshahi, Maral, More, Ankit, Melnick, Levi, Golub, Maximilian, Varatkar, Girish, Shao, Lei, Kolhe, Gaurav, Melts, Dimitry, Klar, Jasmine, L'Heureux, Renee, Perry, Matt, Burger, Doug, Chung, Eric, Deng, Zhaoxia, Naghshineh, Sam, Park, Jongsoo, Naumov, Maxim

Apr-12-2023–arXiv.org Artificial Intelligence

This paper introduces Block Data Representations (BDR), a framework for exploring and evaluating a wide spectrum of narrow-precision formats for deep learning. It enables comparison of popular quantization standards, and through BDR, new formats based on shared microexponents (MX) are identified, which outperform other state-of-the-art quantization approaches, including narrow-precision floating-point and block floating-point. MX utilizes multiple levels of quantization scaling with ultra-fine scaling factors based on shared microexponents in the hardware. The effectiveness of MX is demonstrated on real-world models including large-scale generative pretraining and inferencing, and production-scale recommendation systems.

data format, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Apr-12-2023

arXiv.org PDF

Add feedback

Genre:
- Research Report > New Finding (0.46)

Industry:
- Energy > Oil & Gas (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found