

Neural Sequence Models

Neural Information Processing Systems

All of the questions posed in Table 1 in the main paper can be decomposed into readily available components that our model p_θ can estimate. Q1: P(X_1) is already in a form that our model can estimate directly, due to the autoregressive factorization imposed by the architecture: p_θ(X_1). Q3: The "hitting time", i.e., the next occurrence of a specific event type a ∈ V, is defined as τ(a). Interestingly, Q3 is a generalization of Q2: the two are identical when A = ∅. In practice, computing this expectation exactly is intractable, as it involves an infinite sum.
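Since the exact expectation over hitting times is an infinite sum, a common workaround is a truncated Monte Carlo estimate: roll out the autoregressive model, record the first step at which the target event occurs, and cap each rollout at a finite horizon. The sketch below is purely illustrative; `sample_next` stands in for sampling from a trained model p_θ, and all names and the toy distribution are assumptions, not the paper's implementation.

```python
import random

# Toy event vocabulary; in practice this would be the model's event types V.
VOCAB = ["a", "b", "c"]

def sample_next(history, rng):
    # Stand-in for sampling from an autoregressive model p_theta(. | history).
    # The fixed weights make "a" the most likely event in this toy example.
    return rng.choices(VOCAB, weights=[0.5, 0.3, 0.2])[0]

def estimate_hitting_time(target, n_samples=2000, horizon=50, seed=0):
    """Monte Carlo estimate of E[tau(target)]: draw rollouts and record the
    first step at which `target` occurs, truncating at `horizon` because the
    exact expectation is an infinite sum over arbitrarily long futures."""
    rng = random.Random(seed)
    total, hits = 0, 0
    for _ in range(n_samples):
        history = []
        for t in range(1, horizon + 1):
            history.append(sample_next(history, rng))
            if history[-1] == target:
                total += t
                hits += 1
                break
    # Average hitting time over rollouts that reached the target.
    return total / hits if hits else float("inf")

tau_hat = estimate_hitting_time("a")
```

Under this toy distribution the hitting time of "a" is geometric with success probability 0.5, so the estimate should land near 2 steps; the truncation horizon controls the bias from discarding rollouts that never hit the target.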


Don't Throw Away Your Beams: Improving Consistency-based Uncertainties in LLMs via Beam Search

Fadeeva, Ekaterina, Goloburda, Maiya, Rubashevskii, Aleksandr, Vashurin, Roman, Shelmanov, Artem, Nakov, Preslav, Sachan, Mrinmaya, Panov, Maxim

arXiv.org Machine Learning

Consistency-based methods have emerged as an effective approach to uncertainty quantification (UQ) in large language models. These methods typically rely on several generations obtained via multinomial sampling, measuring their agreement level. However, in short-form QA, multinomial sampling is prone to producing duplicates due to peaked distributions, and its stochasticity introduces considerable variance in uncertainty estimates across runs. We introduce a new family of methods that employ beam search to generate candidates for consistency-based UQ, yielding improved performance and reduced variance compared to multinomial sampling. We also provide a theoretical lower bound on the beam set probability mass under which beam search achieves a smaller error than multinomial sampling. We empirically evaluate our approach on six QA datasets and find that its consistent improvements over multinomial sampling lead to state-of-the-art UQ performance.
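To make the consistency idea concrete, here is a minimal sketch of an agreement-based uncertainty computed from a set of generated candidates (whether obtained by beam search or sampling). It renormalizes the candidates' probability mass and returns one minus the mass of the most frequent normalized answer; this is an illustrative estimator of my own construction, not the paper's exact method, and the answer-normalization step is a crude placeholder.

```python
import math
from collections import defaultdict

def consistency_uncertainty(candidates):
    """Toy consistency-based uncertainty from generation candidates.
    `candidates` is a list of (answer_text, log_prob) pairs. We renormalize
    the (truncated) probability mass over the candidate set, pool mass over
    answers that agree after crude normalization, and return 1 minus the
    mass of the dominant answer: low when candidates agree, high otherwise."""
    weights = [math.exp(lp) for _, lp in candidates]
    total = sum(weights)
    mass = defaultdict(float)
    for (text, _), w in zip(candidates, weights):
        # Placeholder for semantic matching: lowercase + strip whitespace.
        mass[text.strip().lower()] += w / total
    return 1.0 - max(mass.values())

# Three beam candidates for a short-form QA query; two agree on "Paris".
u = consistency_uncertainty([("Paris", -0.1), ("paris", -1.2), ("Lyon", -3.0)])
```

Because beam search returns distinct high-probability sequences, it avoids the duplicate candidates that multinomial sampling produces under peaked distributions, and it is deterministic, which removes the run-to-run variance in estimates like `u` above.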