Stay on topic with Classifier-Free Guidance
Sanchez, Guillaume, Fan, Honglu, Spangher, Alexander, Levi, Elad, Ammanamanchi, Pawan Sasanka, Biderman, Stella
arXiv.org Artificial Intelligence
Classifier-Free Guidance (CFG) [37] has recently emerged in text-to-image generation as a lightweight technique to encourage prompt-adherence in generations. In this work, we demonstrate that CFG can be used broadly as an inference-time technique in pure language modeling. We show that CFG (1) improves the performance of Pythia, GPT-2 and LLaMA-family models across an array of tasks: Q&A, reasoning, code generation, and machine translation, achieving SOTA on LAMBADA with LLaMA-7B over PaLM-540B; (2) brings improvements equivalent to a model with twice the parameter-count; (3) can stack alongside other inference-time methods like Chain-of-Thought and Self-Consistency, yielding further improvements in difficult tasks; (4) can be used to increase the faithfulness and coherence of assistants in challenging form-driven and content-driven prompts: in a human evaluation we show a 75% preference for GPT4All using CFG over baseline.
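The abstract does not give the guidance formula. A minimal sketch of how CFG is commonly applied to next-token prediction, assuming the usual formulation in which the guided log-probabilities are the unconditional log-probabilities plus a guidance strength gamma times the gap between conditional and unconditional ones, might look as follows. The function names, the toy logits, and the default gamma of 1.5 are illustrative assumptions, not values taken from the paper's code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def cfg_log_probs(cond_logits, uncond_logits, gamma=1.5):
    """Combine prompt-conditioned and unconditional next-token logits.

    Assumed formulation (not stated in the abstract):
        guided = uncond + gamma * (cond - uncond), then renormalize.
    gamma = 1 recovers ordinary conditional sampling; gamma > 1 pushes
    the distribution further toward what the prompt makes more likely.
    """
    cond = log_softmax(cond_logits)
    uncond = log_softmax(uncond_logits)
    mixed = [u + gamma * (c - u) for c, u in zip(cond, uncond)]
    return log_softmax(mixed)


# Toy three-token vocabulary (hypothetical numbers for illustration).
cond_logits = [2.0, 1.0, 0.0]    # model output with the prompt in context
uncond_logits = [1.0, 1.0, 1.0]  # model output without the prompt

baseline = log_softmax(cond_logits)
guided = cfg_log_probs(cond_logits, uncond_logits, gamma=1.5)
```

In practice this requires two forward passes per generated token, one with the prompt and one without, which is the main inference-time cost of the technique.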
Jun-30-2023
- Country:
- Africa > Côte d'Ivoire (0.14)
- Asia
- Japan > Honshū
- Chūbu > Toyama Prefecture
- Toyama (0.04)
- Kantō > Tokyo Metropolis Prefecture
- Tokyo (0.14)
- Middle East
- Iraq (0.04)
- Israel > Tel Aviv District
- Tel Aviv (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- France > Île-de-France
- Germany (0.14)
- Greece (0.04)
- Italy (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- North America
- Canada
- United States
- California (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Maryland > Baltimore (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York > New York County
- Manhattan (0.04)
- Texas (0.04)
- Oceania > Australia
- Queensland (0.04)
- Genre:
- Personal (0.92)
- Research Report
- Experimental Study (0.67)
- New Finding (1.00)
- Industry:
- Education (1.00)
- Government (1.00)
- Health & Medicine
- Epidemiology (0.93)
- Therapeutic Area
- Immunology (1.00)
- Infections and Infectious Diseases (1.00)
- Information Technology (0.67)
- Leisure & Entertainment > Sports (1.00)
- Media
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks
- Deep Learning (1.00)
- Natural Language
- Chatbot (1.00)
- Large Language Model (1.00)
- Representation & Reasoning (1.00)
- Vision (1.00)