Residual Energy-Based Models for Text

Bakhtin, Anton; Deng, Yuntian; Gross, Sam; Ott, Myle; Ranzato, Marc'Aurelio; Szlam, Arthur

Dec-21-2020, arXiv.org Machine Learning

Current large-scale auto-regressive language models (Radford et al., 2019; Liu et al., 2018; Graves, 2013) display impressive fluency and can generate convincing text. In this work we start by asking the question: Can the generations of these models be reliably distinguished from real text by statistical discriminators? We find experimentally that the answer is affirmative when we have access to the training data for the model, and guardedly affirmative even if we do not. This suggests that the auto-regressive models can be improved by incorporating the (globally normalized) discriminators into the generative process. We give a formalism for this using the Energy-Based Model framework, and show that it indeed improves the results of the generative models, measured both in terms of perplexity and in terms of human evaluation.
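As a rough illustration of what "incorporating the discriminator into the generative process" can look like, a residual energy-based model is commonly written as P(x) ∝ P_LM(x) · exp(-E(x)), where P_LM is the auto-regressive model and E is the discriminator's energy. The abstract itself does not spell out this formula or a sampling procedure, so the sketch below is an assumption: it draws candidates from the language model and resamples them with weights proportional to exp(-E(x)) (self-normalized importance resampling). The names `self_normalized_weights`, `resample`, and `energy_fn` are hypothetical and exist only for this example.

```python
import math
import random


def self_normalized_weights(energies):
    """Turn energies E(x_i) into resampling probabilities proportional to exp(-E(x_i)).

    The candidates are assumed to have been drawn from the base language
    model, so the P_LM(x) factor of the residual form cancels and only the
    energy term remains in the weight.
    """
    e_min = min(energies)  # shift by the minimum for numerical stability
    unnormalized = [math.exp(-(e - e_min)) for e in energies]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def resample(candidates, energy_fn, k=1, seed=None):
    """Draw k items from `candidates` using the self-normalized weights.

    `energy_fn` is a hypothetical stand-in for a trained discriminator:
    lower energy means the text looks more like real data.
    """
    rng = random.Random(seed)
    weights = self_normalized_weights([energy_fn(x) for x in candidates])
    return rng.choices(candidates, weights=weights, k=k)


if __name__ == "__main__":
    # Toy candidates standing in for samples from a language model.
    samples = ["the cat sat", "cat the the sat", "a dog ran"]
    # Toy energy: penalize an immediately repeated word (illustrative only).
    toy_energy = lambda s: 5.0 if "the the" in s else 0.0
    print(resample(samples, toy_energy, k=5, seed=0))
```

With more candidates, the resampled set approximates the residual distribution more closely, at the cost of more calls to the language model and to the energy function.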

discriminator, language model, residual energy-based model

Country:
- North America
  - Mexico (0.04)
  - United States
    - Ohio > Cuyahoga County
      - Lakewood (0.04)
    - New York
      - Suffolk County > Stony Brook (0.04)
      - New York County > New York City (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - Illinois > Cook County
      - Chicago (0.04)
    - California > Santa Clara County
      - Palo Alto (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Asia > India
  - NCT > New Delhi (0.04)

Genre:
- Research Report (0.64)
- Instructional Material (0.45)

Industry:
- Education (1.00)
- Banking & Finance (0.67)
- Information Technology (0.67)
- Health & Medicine (0.67)
- Leisure & Entertainment > Sports
  - Football (1.00)
- Government > Regional Government
  - North America Government > United States Government (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Performance Analysis > Accuracy (0.67)
    - Statistical Learning (0.67)
