On the Efficacy of Sampling Adapters
Clara Meister, Tiago Pimentel, Luca Malagutti, Ethan G. Wilcox, Ryan Cotterell
arXiv.org Artificial Intelligence
Sampling is a common strategy for generating text from probabilistic models, yet standard ancestral sampling often results in text that is incoherent or ungrammatical. To alleviate this issue, various modifications to a model's sampling distribution, such as nucleus or top-k sampling, have been introduced and are now ubiquitously used in language generation systems. We propose a unified framework for understanding these techniques, which we term sampling adapters. Sampling adapters often lead to qualitatively better text, which raises the question: From a formal perspective, how are they changing the (sub)word-level distributions of language generation models? And why do these local changes lead to higher-quality text? We argue that the shift they enforce can be viewed as a trade-off between precision and recall: while the model loses its ability to produce certain strings, its precision rate on desirable text increases. While this trade-off is not reflected in standard metrics of distribution quality (such as perplexity), we find that several precision-emphasizing measures indeed indicate that sampling adapters can lead to probability distributions more aligned with the true distribution. Further, these measures correlate with higher sequence-level quality scores, specifically, Mauve.
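The adapters the abstract describes can be viewed as simple transformations of a model's next-token distribution. A minimal sketch of two common ones, top-k and nucleus (top-p) truncation, is shown below; the function names and the toy distribution are illustrative, not taken from the paper.

```python
import numpy as np

def topk_adapter(probs: np.ndarray, k: int) -> np.ndarray:
    """Top-k adapter: zero out all but the k most probable tokens, then renormalize."""
    adjusted = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]          # indices of the k largest probabilities
    adjusted[top] = probs[top]
    return adjusted / adjusted.sum()

def nucleus_adapter(probs: np.ndarray, p: float) -> np.ndarray:
    """Nucleus (top-p) adapter: keep the smallest set of tokens whose total
    probability mass is at least p, then renormalize."""
    order = np.argsort(probs)[::-1]       # tokens sorted by descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1  # size of the nucleus
    adjusted = np.zeros_like(probs)
    keep = order[:cutoff]
    adjusted[keep] = probs[keep]
    return adjusted / adjusted.sum()
```

Both adapters trade recall for precision in exactly the sense the abstract describes: tokens outside the kept set can never be produced, while the surviving tokens receive proportionally more mass.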
Jan-5-2024