Training Deep Neural Networks with 8-bit Floating Point Numbers
Naigang Wang, Jungwook Choi, Daniel Brand, Chia-Yu Chen, Kailash Gopalakrishnan
Neural Information Processing Systems
Firstly, when all of the operands (i.e., weights, activations, errors, and gradients) for general matrix multiplication (GEMM) and convolution computations are reduced to 8 bits, most DNNs suffer noticeable accuracy degradation (e.g., Figure 1(a)).
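To make the 8-bit representation concrete, the FP8 format used in this setting allocates 1 sign bit, 5 exponent bits, and 2 mantissa bits. Below is a minimal NumPy sketch, not the authors' code, of round-to-nearest quantization onto such a 1-5-2 grid; the function name and saturation behavior are illustrative assumptions.

```python
import numpy as np

def quantize_fp8_152(x, exp_bits=5, man_bits=2):
    """Illustrative helper: round values to the nearest number
    representable in a 1-5-2 FP8 format (bias 15), saturating at
    the largest representable magnitude."""
    x = np.asarray(x, dtype=np.float64)
    bias = 2 ** (exp_bits - 1) - 1           # 15 for 5 exponent bits
    max_exp = (2 ** exp_bits - 2) - bias     # largest normal exponent: +15
    min_exp = 1 - bias                       # smallest normal exponent: -14
    sign = np.sign(x)
    mag = np.abs(x)
    # Per-value binade, clamped so subnormals share the minimum exponent.
    e = np.floor(np.log2(np.maximum(mag, 2.0 ** min_exp)))
    e = np.clip(e, min_exp, max_exp)
    # Spacing between adjacent representable values in that binade.
    scale = 2.0 ** (e - man_bits)
    q = np.round(mag / scale) * scale        # round-to-nearest on the grid
    # Saturate instead of overflowing to infinity.
    max_val = (2 - 2.0 ** -man_bits) * 2.0 ** max_exp
    q = np.minimum(q, max_val)
    return sign * q
```

With only 2 mantissa bits there are just four representable values per binade, which is why naively quantizing all GEMM operands to this format loses accuracy: for example, 1.2 rounds to 1.25, and everything above roughly 5.7e4 saturates.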