LLMs cannot find reasoning errors, but can correct them!
Tyen, Gladys, Mansoor, Hassan, Cărbune, Victor, Chen, Peter, Mak, Tony
–arXiv.org Artificial Intelligence
While self-correction has shown promise in improving LLM outputs in terms of style and quality (e.g. Chen et al., 2023; Madaan et al., 2023), recent attempts to self-correct logical or reasoning errors often cause correct answers to become incorrect, resulting in worse performances overall (Huang et al., 2023). In this paper, we break down the self-correction process into two core components: mistake finding and output correction. For mistake finding, we release BIG-Bench Mistake, a dataset of logical mistakes in Chain-of-Thought reasoning traces. We provide benchmark numbers for several state-of-the-art LLMs, and demonstrate that LLMs generally struggle with finding logical mistakes. For output correction, we propose a backtracking method which provides large improvements when given information on mistake location. We construe backtracking as a lightweight alternative to reinforcement learning methods, and show that it remains effective with a reward model at 60-70% accuracy.
arXiv.org Artificial Intelligence
Jan-8-2024
- Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Technology: