KwaiYiiMath: Technical Report

Fu, Jiayi, Lin, Lei, Gao, Xiaoyang, Liu, Pengli, Chen, Zhengzong, Yang, Zhirui, Zhang, Shengnan, Zheng, Xue, Li, Yan, Liu, Yuliang, Ye, Xucheng, Liao, Yiqiao, Liao, Chao, Chen, Bin, Song, Chengru, Wan, Junchen, Lin, Zijia, Zhang, Fuzheng, Wang, Zhongyuan, Zhang, Di, Gai, Kun

Oct-19-2023–arXiv.org Artificial Intelligence

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities in handling a variety of natural language processing (NLP) downstream tasks, including mathematical tasks that require multi-step reasoning. We also constructed KMath, a small-scale Chinese primary school mathematics test set of 188 examples, to evaluate the correctness of the problem-solving process generated by the models. Empirical studies demonstrate that KwaiYiiMath achieves state-of-the-art (SOTA) performance on GSM8k, CMath, and KMath compared with models of similar size.

Recent advances in large language models (LLMs) have revolutionized the natural language processing (NLP) landscape (Kenton & Toutanova, 2019; Brown et al., 2020), where scaling up model size and the amount of data is one of the key ingredients (Rae et al., 2021; Chowdhery et al., 2022; Anil et al., 2023; Touvron et al., 2023a;b). Surprisingly, recent progress suggests that LLMs also have the potential to solve reasoning problems (Clark et al., 2020; Talmor et al., 2020; Suzgun et al., 2022; Wei et al., 2022b). In this report, we focus on how to enhance the mathematical reasoning capabilities of LLMs through an alignment process that includes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Specifically, we introduce KwaiYiiMath, which is fine-tuned from KwaiYiiBase with human alignment techniques to tackle mathematical problems. Experimental results show that KwaiYiiMath outperforms many open-source models of similar size by a large margin and approaches GPT-4 on three mathematical benchmarks covering both English and Chinese: GSM8k (Cobbe et al., 2021), CMath (Wei et al., 2023), and the small-scale in-house dataset KMath. KwaiYiiBase is a large language model developed by Kuaishou (https://github.com/kwai/KwaiYii/).
Section 3 introduces the methodology of KwaiYiiMath, including the process of supervised fine-tuning and human preference alignment. It also describes the effort to collect large amounts of high-quality mathematical training data.


