A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors

Naaman Tan, Josef Valvoda, Anej Svete, Tianyu Liu, Yanxia Qin, Min-Yen Kan, Ryan Cotterell

arXiv.org Artificial Intelligence 

The relationship between the quality of a string and its probability $p(\boldsymbol{y})$ under a language model has been influential in the development of techniques to build good text generation systems. For example, several decoding algorithms are motivated as ways of manipulating $p(\boldsymbol{y})$ to produce higher-quality text. In this work, we examine the probability--quality relationship in language models explicitly aligned to human preferences, e.g., through Reinforcement Learning from Human Feedback (RLHF). We find that, given a general language model and its aligned version, for corpora sampled from the aligned language model, there exists a trade-off between the average reward and the average log-likelihood of the strings under the general language model. We provide a formal treatment of this issue and demonstrate how the choice of sampling adaptor allows one to select how much likelihood is exchanged for reward.
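
To make the claim concrete, here is a sketch of one way such a trade-off can arise, under the assumption (not stated in the abstract) that the aligned model $\pi$ has the standard KL-regularized RLHF form with reward $r(\boldsymbol{y})$, regularization strength $\beta$, and normalizer $Z$:

$$\pi(\boldsymbol{y}) = \frac{1}{Z}\, p(\boldsymbol{y}) \exp\!\left(\frac{r(\boldsymbol{y})}{\beta}\right)
\quad\Longrightarrow\quad
\frac{1}{|\mathcal{D}|}\sum_{\boldsymbol{y}\in\mathcal{D}} \log p(\boldsymbol{y})
= \frac{1}{|\mathcal{D}|}\sum_{\boldsymbol{y}\in\mathcal{D}} \log \pi(\boldsymbol{y})
- \frac{1}{\beta}\,\frac{1}{|\mathcal{D}|}\sum_{\boldsymbol{y}\in\mathcal{D}} r(\boldsymbol{y})
+ \log Z$$

For a corpus $\mathcal{D}$ sampled from $\pi$, the average of $\log \pi(\boldsymbol{y})$ concentrates near $-\mathrm{H}(\pi)$, so the first term on the right is approximately fixed; a higher average reward must then be paid for with a lower average log-likelihood under $p$. A sampling adaptor changes the distribution that is actually sampled from, and with it the operating point on this trade-off.

A sampling adaptor, in this context, is a transformation applied to the model's next-token distribution before sampling; temperature scaling and top-k truncation are standard examples. The minimal sketch below illustrates the idea; the function names and toy logits are purely illustrative, not taken from the paper:

    import numpy as np

    def temperature_adaptor(logits, tau):
        # Rescale next-token logits by 1/tau and renormalize (tau < 1 sharpens, tau > 1 flattens).
        scaled = logits / tau
        scaled -= scaled.max()            # numerical stability
        probs = np.exp(scaled)
        return probs / probs.sum()

    def top_k_adaptor(logits, k):
        # Keep only the k most probable tokens and renormalize.
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        cutoff = np.sort(probs)[-k]
        probs = np.where(probs >= cutoff, probs, 0.0)
        return probs / probs.sum()

    # Example: adapt a toy next-token distribution and sample from it.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=50)          # hypothetical logits over a 50-token vocabulary
    next_token = rng.choice(50, p=temperature_adaptor(logits, tau=0.7))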
