BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
Lin Gui, Cristina Gârbacea, and Victor Veitch
Department of Statistics, University of Chicago
This paper concerns the problem of aligning samples from large language models to human preferences using best-of-n sampling, where we draw n samples, rank them, and return the best one. We consider two fundamental problems. First: what is the relationship between best-of-n and approaches to alignment that train LLMs to output samples with a high expected reward (e.g., RLHF or DPO)? To answer this, we embed both the best-of-n distribution and the sampling distributions learned by alignment procedures in a common class of tiltings of the base LLM distribution. We then show that, within this class, best-of-n is essentially optimal in terms of the trade-off between win-rate against the base model vs KL distance from the base model.
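The best-of-n procedure described above (draw n samples, rank them, return the best one) can be sketched in a few lines. This is an illustrative sketch only, not the paper's implementation: `generate` and `reward` are hypothetical stand-ins for an LLM sampler and a reward model.

```python
# Hedged sketch of best-of-n sampling. `generate` and `reward` are
# hypothetical placeholders for a base LLM sampler and a reward model.
import random

def best_of_n(generate, reward, n):
    """Draw n samples from the base distribution, rank by reward,
    and return the highest-reward sample."""
    samples = [generate() for _ in range(n)]
    return max(samples, key=reward)

# Toy usage: the "base model" samples integers and the "reward model"
# prefers larger values.
best = best_of_n(lambda: random.randint(0, 9), lambda s: s, n=4)
```

As n grows, the returned sample's distribution tilts further toward high-reward outputs and correspondingly further (in KL) from the base distribution, which is the trade-off the paper analyzes.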
Neural Information Processing Systems
May-28-2025, 08:01:28 GMT