Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs
Fei Yu, Yingru Li, Benyou Wang
arXiv.org Artificial Intelligence
Value model-guided search is effective in steering generation but suffers from scaling flaws: its advantage diminishes as the sample size grows, and it underperforms non-search baselines. This limitation arises because value models become less reliable on unseen reasoning paths. To address this, we propose an uncertainty-aware search framework with two key components: (1) uncertainty-aware value models that incorporate uncertainty into their predictions, and (2) an uncertainty-aware selection process that uses the proposed efficient Group Thompson Sampling algorithm. Experiments on GSM8K show that our method mitigates search scaling flaws, achieving 90.5% coverage at 16 samples compared to 85.8% for conventional value-guided search. This work establishes the first systematic integration of uncertainty quantification in LLM search paradigms.
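The idea of selecting candidates from uncertain value estimates can be illustrated with a minimal sketch. It is not the paper's Group Thompson Sampling algorithm, whose details are not given in the abstract. It assumes each candidate reasoning path has a Gaussian value estimate (mean and standard deviation) and picks k candidates by repeated Thompson-style draws. The names thompson_select and coverage, the Gaussian form, and the definition of coverage as "at least one correct sample per problem" are all assumptions made for illustration.

```python
import random


def thompson_select(candidates, k, rng=None):
    """Pick up to k candidates by repeated Thompson-style draws.

    candidates: list of (name, mean, std) tuples, where mean and std
    describe an assumed Gaussian value estimate for each path.
    """
    rng = rng or random.Random()
    remaining = list(candidates)
    chosen = []
    for _ in range(min(k, len(remaining))):
        # Draw one plausible value per remaining candidate.
        draws = [rng.gauss(mean, std) for _, mean, std in remaining]
        # Keep the candidate whose sampled value is highest.
        best = max(range(len(remaining)), key=draws.__getitem__)
        chosen.append(remaining.pop(best)[0])
    return chosen


def coverage(per_problem_correct):
    """Fraction of problems with at least one correct sample (assumed definition)."""
    solved = sum(any(flags) for flags in per_problem_correct)
    return solved / len(per_problem_correct)
```

With zero standard deviation the draws equal the means, so the selection reduces to ordinary greedy top-k. Larger standard deviations give uncertain candidates a chance of being kept, which is the behavior an uncertainty-aware selection step relies on.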
Feb-16-2025
- Country:
  - Asia
    - China
      - Guangdong Province > Shenzhen (0.04)
      - Hong Kong (0.04)
    - Middle East > Jordan (0.04)
  - Europe
  - North America
    - Mexico > Mexico City
      - Mexico City (0.04)
    - United States > Louisiana
      - Orleans Parish > New Orleans (0.04)
- Genre:
- Research Report (0.82)
- Technology: