Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs

Feb-16-2025–arXiv.org Artificial Intelligence

Value model-guided search is effective in steering the generation but suffers from scaling flaws: Its superiority diminishes with larger sample sizes, underperforming non-search baselines. This limitation arises from reliability degradation in value models in unseen reasoning paths. To address this, we propose an uncertainty-aware search framework that includes two key components: (1) uncertainty-aware value models that incorporate uncertainty into predictions, and (2) an uncertainty-aware selection process using the proposed efficient Group Thompson Sampling algorithm. Experiments on GSM8K show that our method mitigates search scaling flaws, achieving 90.5% coverage at 16 samples compared to 85.8% for conventional value-guided search. This work establishes the first systematic integration of uncertainty quantification in LLM search paradigms.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

Feb-16-2025

arXiv.org PDF

Add feedback

Country:
- North America > Mexico (0.28)
- Asia > China (0.28)
- Europe > Austria
  - Vienna (0.14)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language > Large Language Model (0.85)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found