RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment

Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian

arXiv.org Artificial Intelligence 

We propose Reinforcement Learning from Contrast Distillation (RLCD), a method for aligning language models to follow natural language principles without using human feedback. RLCD trains a preference model on simulated preference pairs, each containing a high-quality and a low-quality example generated with contrasting positive and negative prompts. The preference model is then used to improve a base unaligned language model via reinforcement learning. Empirically, RLCD outperforms RLAIF (Bai et al., 2022b) and context distillation (Huang et al., 2022) baselines across three diverse alignment tasks (harmlessness, helpfulness, and story outline generation) and at both 7B and 30B model scales for preference data simulation.

Reinforcement Learning from Human Feedback (RLHF) has recently been used to great effect to align pretrained large language models (LLMs) to human preferences, optimizing for desirable qualities like harmlessness and helpfulness (Bai et al., 2022a) and achieving ...
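The pair-simulation step at the heart of RLCD is simple enough to sketch in a few lines. The following is a minimal illustration, assuming the Hugging Face transformers library; the model name, prompt prefixes, and function name are hypothetical placeholders for exposition, not the paper's actual prompts or code.

```python
# A minimal sketch of RLCD-style preference-pair simulation, assuming the
# Hugging Face transformers library. The model name and the positive/negative
# prompt prefixes are hypothetical placeholders, not the paper's actual prompts.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

POS_PREFIX = "(Give a helpful, harmless answer.) "    # positive principle
NEG_PREFIX = "(Give an unhelpful, harmful answer.) "  # negative principle

def simulate_preference_pair(user_query: str) -> dict:
    """Sample the same model under contrasting positive and negative
    prompts; the positively prompted output is labeled as preferred
    directly, with no separate scoring step (unlike RLAIF)."""
    pos = generator(POS_PREFIX + user_query, max_new_tokens=64,
                    do_sample=True)[0]["generated_text"]
    neg = generator(NEG_PREFIX + user_query, max_new_tokens=64,
                    do_sample=True)[0]["generated_text"]
    # Strip the contrasting prefixes and the query itself, so the
    # preference model only ever sees plain (query, response) pairs.
    return {
        "prompt": user_query,
        "chosen": pos[len(POS_PREFIX + user_query):],
        "rejected": neg[len(NEG_PREFIX + user_query):],
    }

pair = simulate_preference_pair("How do I reset my home router?")
print("chosen:", pair["chosen"][:80])
print("rejected:", pair["rejected"][:80])
```

Preference pairs simulated this way can then be used to train a standard preference model, which in turn supplies the reward signal for reinforcement learning on the base model.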
