ATheoretical Study on Bridging Internal Probability and Self-Consistency for LLMReasoning
–Neural Information Processing Systems
Test-time scaling seeks to improve the reasoning performance of large language models (LLMs) by adding computational resources. A prevalent approach within the field is sampling-based test-time scaling methods, which enhance reasoning by generating multiple reasoning paths for a given input during inference. However, despite its practical success, the theoretical foundations remain underexplored. In this paper, we provide the first theoretical framework for analyzing sampling-based test-time scaling methods, grounded in the perspective of confidence estimation. Based on the framework, we analyze two dominant paradigms: self-consistency and perplexity, and reveal key limitations: self-consistency suffers from high estimation error while perplexity exhibits substantial modeling error and possible degradation of the estimation error convergence.
Neural Information Processing Systems
Jun-18-2026, 23:01:52 GMT
- Country:
- Asia (0.28)
- Genre:
- Overview (0.67)
- Research Report
- Experimental Study (1.00)
- New Finding (0.68)
- Technology: