Model Criticism for Long-Form Text Generation

Yuntian Deng, Volodymyr Kuleshov, Alexander M. Rush

Oct-16-2022–arXiv.org Artificial Intelligence

Language models have demonstrated the ability to generate highly fluent text; however, it remains unclear whether their output retains coherent high-level structure (e.g., story progression). Here, we propose applying a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text. Model criticism compares the distributions of real and generated data in a latent space obtained according to an assumed generative process. Different generative processes identify specific failure modes of the underlying model. We perform experiments on three representative aspects of high-level discourse (coherence, coreference, and topicality) and find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
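
The comparison described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes that each document has already been mapped to a sequence of discrete latent states (hypothetical section labels here), that the assumed generative process is a first-order Markov chain over those states, and that the discrepancy statistic is the per-token negative log-likelihood of the latent sequence under that chain. The function names, the smoothing constant `alpha`, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical start-of-document marker


def fit_markov_critic(sequences, states, alpha=1.0):
    """Fit first-order Markov transition probabilities with add-alpha smoothing."""
    counts = {s: Counter() for s in [START] + states}
    for seq in sequences:
        prev = START
        for z in seq:
            counts[prev][z] += 1
            prev = z
    probs = {}
    for prev, c in counts.items():
        total = sum(c.values()) + alpha * len(states)
        probs[prev] = {z: (c[z] + alpha) / total for z in states}
    return probs


def latent_nll(seq, probs):
    """Per-token negative log-likelihood of one latent sequence under the critic."""
    prev, nll = START, 0.0
    for z in seq:
        nll -= math.log(probs[prev][z])
        prev = z
    return nll / len(seq)


def mean_latent_nll(sequences, probs):
    """Average the per-token latent NLL over a collection of documents."""
    return sum(latent_nll(s, probs) for s in sequences) / len(sequences)


# Toy latent sequences (assumption: already inferred from the text).
# A careful setup would score held-out real data rather than the data
# used to fit the critic, as is done here for brevity.
STATES = ["intro", "body", "end"]
real = [["intro", "body", "body", "end"]] * 3 + [["intro", "body", "end"]]
generated = [["body", "intro", "end", "body"], ["end", "end", "intro", "body"]]

critic = fit_markov_critic(real, STATES)
real_score = mean_latent_nll(real, critic)
gen_score = mean_latent_nll(generated, critic)
gap = gen_score - real_score  # larger gap: generated structure looks less like real data

print(f"real={real_score:.3f} generated={gen_score:.3f} gap={gap:.3f}")
```

Under these assumptions, a positive gap indicates that the generated documents order their latent states in ways the critic considers unlikely. Swapping in a different assumed generative process (for example, one over entity mentions or topics) would probe a different aspect of structure.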

computational linguistic, large language model, machine learning
