LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
Gao, Bofei, Cai, Zefan, Xu, Runxin, Wang, Peiyi, Zheng, Ce, Lin, Runji, Lu, Keming, Liu, Dayiheng, Zhou, Chang, Xiao, Wen, Hu, Junjie, Liu, Tianyu, Chang, Baobao
–arXiv.org Artificial Intelligence
Mathematical verifiers achieve success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate this insufficiency of binary labels, we introduce step-wise natural language feedback as rationale labels (i.e., the correctness of the current step and the explanation for it). In this paper, we propose \textbf{Math-Minos}, a natural language feedback-enhanced verifier built with automatically generated training data and a two-stage training paradigm for effective training and efficient inference. Our experiments reveal that a small set (30k) of natural language feedback examples can significantly boost verifier accuracy by 1.6\% (86.6\% $\rightarrow$ 88.2\%) on GSM8K and 0.8\% (37.8\% $\rightarrow$ 38.6\%) on MATH. We have released our code and data for further exploration.
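As a rough illustration of the data described in the abstract, the sketch below shows how step-wise natural language feedback might be attached to each solution step and then used in the two training stages (feedback generation, then binary verification). All class and method names, and the worked example, are hypothetical illustrations and are not taken from the released Math-Minos code or data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepFeedback:
    step_text: str    # one reasoning step of the candidate solution
    is_correct: bool  # step-level correctness label
    explanation: str  # natural-language rationale for the label


@dataclass
class VerifierExample:
    question: str
    steps: List[StepFeedback]

    def stage1_target(self) -> str:
        """Stage 1: target text for training a critic to emit step-wise feedback."""
        lines = []
        for i, s in enumerate(self.steps, 1):
            verdict = "correct" if s.is_correct else "incorrect"
            lines.append(f"Step {i} is {verdict}: {s.explanation}")
        return "\n".join(lines)

    def stage2_label(self) -> int:
        """Stage 2: the usual solution-level binary label for the verifier."""
        return int(all(s.is_correct for s in self.steps))


# Hypothetical GSM8K-style example with an injected arithmetic error.
example = VerifierExample(
    question="Tom has 3 boxes of 12 apples and gives away 10. How many remain?",
    steps=[
        StepFeedback("3 * 12 = 36 apples in total.", True,
                     "The multiplication is carried out correctly."),
        StepFeedback("36 - 10 = 27 apples remain.", False,
                     "The subtraction is wrong; 36 - 10 = 26."),
    ],
)

print(example.stage1_target())  # step-wise feedback used in stage 1
print(example.stage2_label())   # 0, since one step is incorrect
```

Presumably the stage-2 verifier scores sampled solutions directly at inference time without generating the feedback text, which would account for the paper's claim of efficient inference.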
Jul-8-2024
- Country:
- North America
- Canada (0.14)
- United States > Wisconsin (0.14)
- Genre:
- Research Report (0.50)
- Technology: