Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models

Vilnis, Luke, Zemlyanskiy, Yury, Murray, Patrick, Passos, Alexandre, Sanghai, Sumit

Jun-1-2023–arXiv.org Artificial Intelligence

Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not easy to parallelize. Alternatively, methods such as temperature sampling and its modifications (top-k sampling, nucleus sampling, typical decoding, and others) are embarrassingly parallel, but have no guarantees about duplicate samples. We present a framework for sampling according to an arithmetic code book implicitly defined by a large language model, compatible with common sampling variations, with provable beam diversity under certain conditions, as well as being embarrassingly parallel and providing unbiased and consistent expectations from the original model. We demonstrate the effectiveness of our approach on WMT machine translation, more than halving the standard deviation when estimating expected BLEU score reward, and closing the BLEU score gap between independent ...

Figure 1: Sequence model over sequences of length two and a vocabulary of three symbols mapping points in the unit interval to each sequence. An even lattice of code points parallelizes decoding into diverse high-probability sequences.
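The idea of decoding from code points in the unit interval can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy conditional probabilities, the function names (`next_token_probs`, `decode_code_point`, `lattice_code_points`, `arithmetic_sample`), and the use of a single shared random offset for the lattice are all assumptions made for illustration. The toy model mirrors the setting of Figure 1: sequences of length two over a vocabulary of three symbols.

```python
import random

# Hypothetical toy model: three symbols, sequences of length two.
VOCAB = ["a", "b", "c"]
SEQ_LEN = 2

def next_token_probs(prefix):
    """Toy conditional distribution standing in for a language model."""
    if not prefix:
        return [0.5, 0.3, 0.2]
    table = {"a": [0.6, 0.3, 0.1], "b": [0.2, 0.5, 0.3], "c": [0.25, 0.25, 0.5]}
    return table[prefix[-1]]

def decode_code_point(u):
    """Map a code point u in [0, 1) to the sequence whose interval contains it."""
    prefix = []
    for _ in range(SEQ_LEN):
        probs = next_token_probs(prefix)
        cum = 0.0
        for tok, p in zip(VOCAB, probs):
            # The last token also catches any floating-point leftover.
            if u < cum + p or tok == VOCAB[-1]:
                # Rescale u into the chosen sub-interval for the next step.
                u = (u - cum) / p
                u = min(max(u, 0.0), 1.0 - 1e-12)
                prefix.append(tok)
                break
            cum += p
    return "".join(prefix)

def lattice_code_points(n, offset):
    """Evenly spaced code points in [0, 1), shifted by offset in [0, 1)."""
    return [(i + offset) / n for i in range(n)]

def arithmetic_sample(n, rng):
    """Decode n sequences from a randomly shifted even lattice (assumed scheme)."""
    offset = rng.random()
    # Each code point is decoded independently, so this loop could run in parallel.
    return [decode_code_point(u) for u in lattice_code_points(n, offset)]

if __name__ == "__main__":
    print(arithmetic_sample(3, random.Random(0)))
```

In this toy model the most probable sequence has probability 0.3, which is below the lattice spacing of 1/3 when three code points are used, so the three decoded sequences are necessarily distinct. This is only an illustration of how an even lattice can yield diversity; the precise conditions for the guarantee are those stated in the paper, not this sketch.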

large language model, machine learning, natural language

Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America
  - Dominican Republic (0.04)
  - United States > Hawaii
    - Honolulu County > Honolulu (0.04)
  - Puerto Rico > San Juan
    - San Juan (0.04)

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (1.00)
  - Natural Language
    - Machine Translation (0.88)
    - Large Language Model (0.73)
