KwaiYiiMath: Technical Report
Fu, Jiayi, Lin, Lei, Gao, Xiaoyang, Liu, Pengli, Chen, Zhengzong, Yang, Zhirui, Zhang, Shengnan, Zheng, Xue, Li, Yan, Liu, Yuliang, Ye, Xucheng, Liao, Yiqiao, Liao, Chao, Chen, Bin, Song, Chengru, Wan, Junchen, Lin, Zijia, Zhang, Fuzheng, Wang, Zhongyuan, Zhang, Di, Gai, Kun
–arXiv.org Artificial Intelligence
Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, even mathematical tasks requiring multi-step reasoning. We also constructed a small-scale Chinese primary school mathematics test set, named KMath, consisting of 188 examples, to evaluate the correctness of the problem-solving process generated by the models. Empirical studies demonstrate that KwaiYiiMath achieves state-of-the-art (SOTA) performance on GSM8k, CMath, and KMath compared with models of similar size.
Recent advances in large language models (LLMs) have revolutionized the natural language processing (NLP) landscape Kenton & Toutanova (2019); Brown et al. (2020), where scaling up model size and the amount of data is one of the key ingredients Rae et al. (2021); Chowdhery et al. (2022); Anil et al. (2023); Touvron et al. (2023a;b). Surprisingly, recent progress suggests that LLMs also have the potential to solve reasoning problems Clark et al. (2020); Talmor et al. (2020); Suzgun et al. (2022); Wei et al. (2022b).
In this report, we focus on how to enhance the mathematical reasoning capabilities of LLMs through an alignment process that includes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Specifically, we introduce KwaiYiiMath, which is fine-tuned from KwaiYiiBase with human alignment techniques to tackle mathematical problems. Experimental results show that KwaiYiiMath outperforms many open-source models of similar size by a large margin and approaches GPT-4 on three mathematical benchmarks covering both English and Chinese: GSM8k Cobbe et al. (2021), CMath Wei et al. (2023), and a small-scale in-house dataset, KMath.
KwaiYiiBase is a large language model developed by Kuaishou (https://github.com/kwai/KwaiYii/).
Section 3 introduces the methodology of KwaiYiiMath, including supervised fine-tuning and human preference alignment. It also describes the effort to collect a large amount of high-quality mathematical training data.
Oct-19-2023