Best-of-Majority: Minimax-Optimal Strategy for Pass@$k$ Inference Scaling

Di, Qiwei, Ji, Kaixuan, Li, Xuheng, Zhao, Heyang, Gu, Quanquan

Oct-6-2025–arXiv.org Machine Learning

LLM inference often generates a batch of candidates for a prompt and selects one via strategies like majority voting or Best-of-$N$ (BoN). For difficult tasks, this single-shot selection often underperforms. Consequently, evaluations commonly report Pass@$k$: the agent may submit up to $k$ responses, and only the best of them is used when computing regret. Motivated by this, we study inference scaling in the more general Pass@$k$ inference setting, and prove that neither majority voting nor BoN exhibits the desirable scaling with $k$ and the sampling budget $N$. Combining the advantages of majority voting and BoN, we propose a new inference strategy called Best-of-Majority (BoM), with a pivotal step that restricts the candidates to the responses with high frequency in the $N$ samples before selecting the top-$k$ rewards. We prove that when the sampling budget is $N=\tilde{\Omega}(C^*)$, the regret of BoM is $O(\epsilon_{\mathrm{opt}}+\sqrt{\epsilon_{\mathrm{RM}}^2C^*/k})$, where $C^*$ is the coverage coefficient, $\epsilon_{\mathrm{RM}}$ is the estimation error of the reward model, and $\epsilon_{\mathrm{opt}}$ is the estimation error of reward at the optimal response. We further establish a matching lower bound, certifying that our algorithm is minimax optimal. Beyond optimality, BoM has a key advantage: unlike majority voting and BoN, its performance does not degrade when increasing $N$. Experimental results of inference on math problems show BoM outperforming both majority voting and BoN.
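The two-step selection described in the abstract (filter by frequency, then take the top-$k$ rewards) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `best_of_majority`, the `min_freq` threshold parameter, and the callable `reward_fn` standing in for a reward model are all assumptions made for illustration, and the abstract does not specify how the frequency threshold is chosen.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence


def best_of_majority(
    samples: Sequence[Hashable],
    reward_fn: Callable[[Hashable], float],
    k: int,
    min_freq: float,
) -> List[Hashable]:
    """Sketch of a BoM-style selection: frequency filter, then top-k by reward.

    `min_freq` is a hypothetical threshold on empirical frequency; the
    abstract only says candidates are restricted to high-frequency responses.
    """
    if not samples:
        return []
    n = len(samples)
    counts = Counter(samples)
    # Step 1 (majority-style filter): keep distinct responses whose empirical
    # frequency among the N samples is at least min_freq.
    candidates = [resp for resp, c in counts.items() if c / n >= min_freq]
    # Step 2 (BoN-style selection): rank the survivors by reward score and
    # return at most k of them.
    candidates.sort(key=reward_fn, reverse=True)
    return candidates[:k]


# Toy example: final answers to a math problem with made-up reward scores.
samples = ["42", "42", "41", "42", "17", "41", "99"]
scores = {"42": 0.7, "41": 0.6, "17": 0.9, "99": 0.95}
submitted = best_of_majority(samples, scores.__getitem__, k=2, min_freq=0.2)
```

In this toy example, the rare responses "17" and "99" carry the highest (possibly overestimated) scores, so ranking by reward alone would submit them; the frequency filter removes them before ranking. If no response meets the threshold, this sketch returns an empty list, which a real system would need to handle.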


