Large Language Models Can Self-Correct with Minimal Effort
Wu, Zhenyu, Zeng, Qingkai, Zhang, Zhihan, Tan, Zhaoxuan, Shen, Chao, Jiang, Meng
arXiv.org Artificial Intelligence
Intrinsic self-correction is a method that instructs large language models (LLMs) to verify and correct their own responses without external feedback. Unfortunately, an earlier study concluded that LLMs cannot yet self-correct their reasoning. We find that a simple yet effective verification method can unleash the inherent capabilities of LLMs: mask a key condition in the question, append the current response to construct a verification question, and predict the masked condition to verify the response. The condition can be an entity in an open-domain question or a numeric value in a math question, and identifying it requires minimal effort (via prompting). We propose ProCo, an iterative verify-then-correct framework that progressively identifies and corrects (probably) false responses. We conduct experiments on three reasoning tasks. On average, ProCo, with GPT-3.5-Turbo as the backend LLM, yields $+6.8$ exact match on four open-domain question answering datasets, $+14.1$ accuracy on three arithmetic reasoning datasets, and $+9.6$ accuracy on a commonsense reasoning dataset, compared to Self-Correct.
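The verify-then-correct loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`mask_condition`, `verify`, `proco_sketch`), the prompt wording, the exact-string comparison used for verification, and the hint given on retry are all assumptions made for illustration. The `toy_llm` stub is hypothetical and stands in for a real model call so that the example runs on its own.

```python
import re
from typing import Callable, List

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def mask_condition(question: str, condition: str, placeholder: str = "X") -> str:
    """Replace the first occurrence of the key condition with a placeholder."""
    return question.replace(condition, placeholder, 1)


def build_verification_question(question: str, condition: str, answer: str) -> str:
    """Mask the condition and append the current answer (assumed prompt wording)."""
    masked = mask_condition(question, condition)
    return f"{masked} The answer is {answer}. What is the value of X?"


def verify(llm: LLM, question: str, condition: str, answer: str) -> bool:
    """Accept the answer if the model recovers the masked condition from it."""
    predicted = llm(build_verification_question(question, condition, answer))
    return predicted.strip() == condition  # assumption: exact string match


def proco_sketch(llm: LLM, question: str, condition: str, max_iters: int = 3) -> str:
    """Iteratively verify the current answer and re-ask when verification fails."""
    answer = llm(question)
    rejected: List[str] = []
    for _ in range(max_iters):
        if verify(llm, question, condition, answer):
            break
        rejected.append(answer)
        # Assumed hint format: tell the model which answers are likely wrong.
        hint = f"{question} The answer is probably not {', '.join(rejected)}."
        answer = llm(hint)
    return answer


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model; it only handles the demo question."""
    if "What is the value of X?" in prompt:
        total = int(re.search(r"The answer is (-?\d+)", prompt).group(1))
        return str(total - 5)  # solves X + 5 = total
    if "probably not" in prompt:
        return "8"  # corrected answer after the hint
    return "9"  # deliberately wrong first attempt


QUESTION = "Tom has 3 apples and buys 5 more. How many apples does Tom have now?"

if __name__ == "__main__":
    print(proco_sketch(toy_llm, QUESTION, condition="3"))
```

In this toy run the first answer fails verification because the model cannot recover the masked value from it, so the loop asks again with a hint and accepts the next answer once the masked condition is predicted correctly.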
Jun-23-2024
- Country:
  - Asia
    - China > Shaanxi Province
      - Xi'an (0.04)
    - India (0.29)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
    - Singapore (0.04)
  - Europe
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
    - United Kingdom > England
      - Greater London > London
        - Wimbledon (0.05)
      - Oxfordshire > Oxford (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - Dominican Republic (0.04)
    - United States
      - Alabama (0.04)
      - Idaho (0.05)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Montana (0.05)
      - Virginia (0.05)
      - Washington > King County
        - Seattle (0.04)
- Genre:
  - Research Report > New Finding (0.87)
- Industry:
  - Education > Educational Setting
    - K-12 Education (0.47)
  - Government (0.93)
  - Law (0.94)
  - Leisure & Entertainment > Sports (1.00)
- Technology: