InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models

Wang, Wenjun, Cai, Shuo, Xie, Congkai, Feng, Mingfa, Zhang, Yiming, Li, Zhen, Yang, Kejing, Li, Ming, Cao, Jiannong, Yang, Hongxia

Oct-20-2025–arXiv.org Artificial Intelligence

The immense computational cost of training Large Language Models (LLMs) presents a major barrier to innovation. While FP8 training offers a promising solution with significant theoretical efficiency gains, its widespread adoption has been hindered by the lack of a comprehensive, open-source training recipe. To bridge this gap, we introduce an end-to-end FP8 training recipe that seamlessly integrates continual pre-training and supervised fine-tuning. Our methodology employs a fine-grained, hybrid-granularity quantization strategy to maintain numerical fidelity while maximizing computational efficiency. Through extensive experiments, including the continue pre-training of models on a 160B-token corpus, we demonstrate that our recipe is not only remarkably stable but also essentially lossless, achieving performance on par with the BF16 baseline across a suite of reasoning benchmarks. Crucially, this is achieved with substantial efficiency improvements, including up to a 22% reduction in training time, a 14% decrease in peak memory usage, and a 19% increase in throughput. Our results establish FP8 as a practical and robust alternative to BF16, and we will release the accompanying code to further democratize large-scale model training.

large language model, machine learning, quantization, (16 more...)

arXiv.org Artificial Intelligence

Oct-20-2025

arXiv.org PDF

Add feedback

Genre:
- Research Report > New Finding (0.88)

Industry:
- Education (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found