A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning

Jun-18-2026, 23:01:52 GMT–Neural Information Processing Systems

Test-time scaling seeks to improve the reasoning performance of large language models (LLMs) by adding computational resources at inference time. A prevalent approach is sampling-based test-time scaling, which enhances reasoning by generating multiple reasoning paths for a given input during inference. Despite the practical success of these methods, their theoretical foundations remain underexplored. In this paper, we provide the first theoretical framework for analyzing sampling-based test-time scaling methods, grounded in the perspective of confidence estimation. Based on this framework, we analyze two dominant paradigms, self-consistency and perplexity, and reveal key limitations: self-consistency suffers from high estimation error, while perplexity exhibits substantial modeling error and a possible degradation in the convergence of its estimation error.
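To make the two paradigms concrete, here is a minimal sketch of how each one might pick an answer from a set of sampled reasoning paths. It uses the common textbook forms (majority vote with empirical frequency as the confidence for self-consistency, and the exponential of the negative mean token log-probability for perplexity), which are assumptions here and not necessarily the exact estimators analyzed in the paper. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def self_consistency(answers):
    """Majority vote over sampled answers.

    Returns the most frequent answer and its empirical frequency,
    used here as a simple confidence estimate (an assumption).
    """
    counts = Counter(answers)
    answer, count = counts.most_common(1)[0]
    return answer, count / len(answers)


def perplexity(token_logprobs):
    """Perplexity of one path: exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def lowest_perplexity(paths):
    """Pick the answer of the path with the lowest perplexity.

    `paths` is a list of (answer, token_logprobs) pairs.
    """
    return min(paths, key=lambda p: perplexity(p[1]))[0]


# Hypothetical sampled paths: (final answer, per-token log-probabilities).
paths = [
    ("42", [-0.1, -0.2, -0.1]),
    ("42", [-0.5, -0.4, -0.6]),
    ("17", [-0.05, -0.05, -0.05]),
]

sc_answer, sc_conf = self_consistency([a for a, _ in paths])
ppl_answer = lowest_perplexity(paths)
```

In this toy input the two criteria disagree: the vote favors the answer that appears most often, while the perplexity criterion favors the single path to which the model assigns the highest average token probability. With only three samples the vote frequency is a coarse estimate, which gives a rough feel for the estimation error the abstract attributes to self-consistency.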

artificial intelligence, large language model, natural language

Country:
- Asia (0.28)

Genre:
- Overview (0.67)
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.68)

Industry:
- Education (0.92)
- Energy (0.57)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
