Large Language Models Can Self-Correct with Minimal Effort

Wu, Zhenyu, Zeng, Qingkai, Zhang, Zhihan, Tan, Zhaoxuan, Shen, Chao, Jiang, Meng

Jun-23-2024–arXiv.org Artificial Intelligence

Intrinsic self-correct was a method that instructed large language models (LLMs) to verify and correct their responses without external feedback. Unfortunately, the study concluded that the LLMs could not self-correct reasoning yet. We find that a simple yet effective verification method can unleash inherent capabilities of the LLMs. That is to mask a key condition in the question, add the current response to construct a verification question, and predict the condition to verify the response. The condition can be an entity in an open-domain question or a numeric value in a math question, which requires minimal effort (via prompting) to identify. We propose an iterative verify-then-correct framework to progressively identify and correct (probably) false responses, named ProCo. We conduct experiments on three reasoning tasks. On average, ProCo, with GPT-3.5-Turbo as the backend LLM, yields $+6.8$ exact match on four open-domain question answering datasets, $+14.1$ accuracy on three arithmetic reasoning datasets, and $+9.6$ accuracy on a commonsense reasoning dataset, compared to Self-Correct.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

Jun-23-2024

arXiv.org PDF

Add feedback

Country:
- Asia (1.00)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.14)

Genre:
- Research Report > New Finding (0.87)

Industry:
- Education > Educational Setting
  - K-12 Education (0.47)
- Government (0.93)
- Law (0.94)
- Leisure & Entertainment > Sports (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.48)
  - Natural Language > Large Language Model (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found