GLTR: Statistical Detection and Visualization of Generated Text

Gehrmann, Sebastian, Strobelt, Hendrik, Rush, Alexander M.

Jun-10-2019–arXiv.org Artificial Intelligence

The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection-rate of fake text from 54% to 72% without any prior training.

Figure 1: The top-k overlay within GLTR. It is easy to distinguish sampled from written text. The real text is from the Wikipedia page of The Great British Bake Off, the fake from GPT-2 large with temperature 0.7.
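The abstract does not spell out how the top-k overlay named in the Figure 1 caption is computed. One plausible reading is that each token is labeled by the rank it holds in a language model's predicted next-token distribution. The sketch below illustrates that idea only; the function names, the bucket thresholds, and the toy distributions are assumptions made for illustration and are not taken from the GLTR code. A real setup would obtain the distributions from a language model such as GPT-2.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical rank thresholds and labels, chosen for illustration only.
BUCKETS: Sequence[Tuple[int, str]] = (
    (10, "top-10"),
    (100, "top-100"),
    (1000, "top-1000"),
)


def rank_of(token: str, probs: Dict[str, float]) -> int:
    """Return the 1-based rank of `token` in a next-token distribution.

    Raises ValueError if the token is missing from the distribution.
    """
    ordered = sorted(probs, key=probs.get, reverse=True)
    return ordered.index(token) + 1


def bucket(rank: int, buckets: Sequence[Tuple[int, str]] = BUCKETS) -> str:
    """Map a rank to the label of the first bucket whose limit covers it."""
    for limit, label in buckets:
        if rank <= limit:
            return label
    return "rare"


def annotate(
    tokens: List[str], dists: List[Dict[str, float]]
) -> List[Tuple[str, str]]:
    """Pair each observed token with its rank bucket.

    `dists[i]` is the (assumed) model distribution for position i,
    conditioned on the tokens before it.
    """
    return [(tok, bucket(rank_of(tok, d))) for tok, d in zip(tokens, dists)]


def share_in_bucket(labels: List[Tuple[str, str]], label: str) -> float:
    """Fraction of tokens that fall into the given bucket."""
    if not labels:
        return 0.0
    return sum(1 for _, lab in labels if lab == label) / len(labels)


if __name__ == "__main__":
    # Toy distributions standing in for real model output.
    toy = {"the": 0.6, "a": 0.3, "bake": 0.1}
    labeled = annotate(["the", "bake"], [toy, toy])
    print(labeled, share_in_bucket(labeled, "top-10"))
```

Under this reading, text sampled from a model would tend to place a larger share of its tokens in the highest-ranked bucket than human-written text, which is the kind of contrast a color overlay could make visible to a non-expert.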

artificial intelligence, large language model, natural language

Genre:
- Research Report (0.84)

Industry:
- Media > News (0.47)

Technology:
- Information Technology > Artificial Intelligence > Natural Language
  - Generation (0.49)
  - Large Language Model (0.38)
  - Chatbot (0.37)
