Optimizing Reasoning Efficiency through Prompt Difficulty Prediction

Bo Zhao, Berkcan Kapusuzoglu, Kartik Balasubramaniam, Sambit Sahu, Supriyo Chakraborty, Genta Indra Winata

Nov-7-2025–arXiv.org Artificial Intelligence

Reasoning language models perform well on complex tasks but are costly to deploy due to their size and long reasoning traces. We propose a routing approach that assigns each problem to the smallest model likely to solve it, reducing compute without sacrificing accuracy. Using intermediate representations from s1.1-32B, we train lightweight predictors of problem difficulty or model correctness to guide routing across a pool of reasoning models. On diverse math benchmarks, routing improves efficiency over random assignment and matches s1.1-32B's performance while using significantly less compute. Our results demonstrate that difficulty-aware routing is effective for cost-efficient deployment of reasoning models.
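
The routing rule described in the abstract (send each problem to the smallest model likely to solve it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Candidate` class, the `route` and `linear_probe` helpers, the model labels, the cost values, and the 0.5 threshold are all hypothetical. The abstract says the predictors are lightweight and trained on intermediate representations from s1.1-32B; here each predictor is assumed to be a logistic probe over a feature vector.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str    # hypothetical model label
    cost: float  # assumed relative compute cost (e.g. size in billions of parameters)


def linear_probe(weights: Sequence[float], bias: float) -> Callable[[Sequence[float]], float]:
    """Build a logistic probe mapping a feature vector to P(model answers correctly).

    The weights would come from training; here they are placeholders.
    """
    def predict(features: Sequence[float]) -> float:
        z = sum(w * x for w, x in zip(weights, features)) + bias
        return 1.0 / (1.0 + math.exp(-z))
    return predict


def route(
    features: Sequence[float],
    pool: Sequence[Candidate],
    predictors: Sequence[Callable[[Sequence[float]], float]],
    threshold: float = 0.5,
) -> Candidate:
    """Pick the cheapest candidate whose predicted success probability meets the threshold.

    If no candidate clears the threshold, fall back to the most expensive one.
    """
    ranked = sorted(zip(pool, predictors), key=lambda pair: pair[0].cost)
    for candidate, predict in ranked:
        if predict(features) >= threshold:
            return candidate
    return ranked[-1][0]


# Usage with made-up probes: a larger bias stands in for a stronger model.
pool = [Candidate("small", 1.5), Candidate("medium", 7.0), Candidate("large", 32.0)]
probes = [
    linear_probe([-1.0], -1.0),
    linear_probe([-1.0], 1.0),
    linear_probe([-1.0], 3.0),
]
chosen = route([0.5], pool, probes)  # feature = a hypothetical difficulty signal
print(chosen.name)
```

Raising the threshold trades compute for accuracy: more problems fall through to larger models. The fallback to the largest model keeps every problem answered even when no probe is confident.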
