On the Efficacy of Sampling Adapters
Clara Meister, Tiago Pimentel, Luca Malagutti, Ethan G. Wilcox, Ryan Cotterell
arXiv.org Artificial Intelligence
Sampling is a common strategy for generating text from probabilistic models, yet standard ancestral sampling often results in text that is incoherent or ungrammatical. To alleviate this issue, various modifications to a model's sampling distribution, such as nucleus or top-k sampling, have been introduced and are now ubiquitously used in language generation systems. We propose a unified framework for understanding these techniques, which we term sampling adapters. Sampling adapters often lead to qualitatively better text, which raises the question: From a formal perspective, how are they changing the (sub)word-level distributions of language generation models? And why do these local changes lead to higher-quality text? We argue that the shift they enforce can be viewed as a trade-off between precision and recall: while the model loses its ability to produce certain strings, its precision rate on desirable text increases. While this trade-off is not reflected in standard metrics of distribution quality (such as perplexity), we find that several precision-emphasizing measures indeed indicate that sampling adapters can lead to probability distributions more aligned with the true distribution. Further, these measures correlate with higher sequence-level quality scores, specifically, Mauve.
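The adapters the abstract describes can be viewed as simple transformations of a model's next-token distribution. A minimal sketch of two common ones, top-k and nucleus (top-p) truncation, is shown below; the function names and the toy distribution are illustrative, not taken from the paper.

```python
import numpy as np

def topk_adapter(probs: np.ndarray, k: int) -> np.ndarray:
    """Top-k adapter: zero out all but the k most probable tokens, then renormalize."""
    adjusted = np.zeros_like(probs)
    top = np.argsort(probs)[-k:]          # indices of the k largest probabilities
    adjusted[top] = probs[top]
    return adjusted / adjusted.sum()

def nucleus_adapter(probs: np.ndarray, p: float) -> np.ndarray:
    """Nucleus (top-p) adapter: keep the smallest set of tokens whose total
    probability mass is at least p, then renormalize."""
    order = np.argsort(probs)[::-1]       # tokens sorted by descending probability
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, p) + 1  # size of the nucleus
    adjusted = np.zeros_like(probs)
    keep = order[:cutoff]
    adjusted[keep] = probs[keep]
    return adjusted / adjusted.sum()
```

Both adapters trade recall for precision in exactly the sense the abstract describes: tokens outside the kept set can never be produced, while the surviving tokens receive proportionally more mass.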
Jan-5-2024