Enhancing Mathematical Reasoning in Large Language Models with Self-Consistency-Based Hallucination Detection

Liu, MingShan; Fang, Jialing

arXiv.org Artificial Intelligence 

Large language models (LLMs) have demonstrated impressive mathematical reasoning capabilities but remain susceptible to hallucinations (plausible yet incorrect statements), particularly in complex domains requiring rigorous logical deduction. Current approaches to improving reliability often neglect the logical consistency of intermediate reasoning steps and focus primarily on final answer verification. We propose a structured self-consistency (SC) framework that systematically evaluates factual concordance across both intermediate reasoning steps and final outputs, creating a hierarchical verification mechanism for mathematical reasoning. Our framework employs a probabilistic formulation that quantifies consistency through ensemble agreement, entropy minimization, and structural isomorphism detection in reasoning graphs. We evaluate our approach on three fundamental mathematical tasks: formal theorem proving, symbolic transformation, and numerical computation. Experimental results show significant improvements over baseline approaches: proof validity increases by 8.3% (p < 0.01), symbolic reasoning accuracy by 9.6%, and numerical stability by 42.8%, while computational overhead is reduced by 56.3%. Further analysis reveals that our structured SC framework correlates strongly with human expert evaluation (ρ = 0.87).
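The abstract does not spell out the probabilistic formulation, so the following is only a minimal sketch of the general idea behind ensemble agreement and entropy as consistency signals over sampled final answers. The function names (`agreement_and_entropy`, `flag_hallucination`) and the thresholds are hypothetical illustrations, not the authors' method, and the sketch ignores intermediate steps and reasoning-graph isomorphism.

```python
import math
from collections import Counter


def agreement_and_entropy(answers):
    """Summarize how consistent a set of sampled answers is.

    Returns (majority_answer, agreement_rate, normalized_entropy), where
    normalized_entropy is 0.0 when all samples agree and 1.0 when all differ.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    total = len(answers)
    majority, top = counts.most_common(1)[0]
    probs = [c / total for c in counts.values()]
    # Shannon entropy of the empirical answer distribution.
    entropy = sum(p * math.log(1.0 / p) for p in probs)
    # The largest possible entropy for `total` samples is log(total).
    max_entropy = math.log(total) if total > 1 else 1.0
    return majority, top / total, entropy / max_entropy


def flag_hallucination(answers, min_agreement=0.6, max_entropy=0.5):
    """Flag a sample set as unreliable (thresholds are illustrative assumptions)."""
    _, agreement, entropy = agreement_and_entropy(answers)
    return agreement < min_agreement or entropy > max_entropy
```

For example, five samples of which four agree would pass these illustrative thresholds, while four mutually different answers would be flagged. A step-level variant could apply the same measure to each intermediate step, assuming the steps can be aligned across samples.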
