GLTR: Statistical Detection and Visualization of Generated Text
Gehrmann, Sebastian, Strobelt, Hendrik, Rush, Alexander M.
–arXiv.org Artificial Intelligence
The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection-rate of fake text from 54% to 72% without any prior training.

Figure 1: The top-k overlay within GLTR. It is easy to distinguish sampled from written text. The real text is from the Wikipedia page of The Great British Bake Off, the fake from GPT-2 large with temperature 0.7.
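The text above mentions a top-k overlay but does not describe how it is computed. The following is a minimal sketch of one plausible reading, assuming the overlay labels each observed token by its rank in a language model's predicted next-token distribution. The bucket thresholds (10, 100, 1000), the function names, and the toy distribution are illustrative assumptions, not details given in this listing. A real setup would take the per-position distributions from a model such as GPT-2 rather than from a hand-built dictionary.

```python
from typing import Dict, List, Tuple

# Assumed rank thresholds for the overlay buckets (illustrative only).
BUCKETS = [(10, "top-10"), (100, "top-100"), (1000, "top-1000")]


def token_rank(probs: Dict[str, float], token: str) -> int:
    """Return the 1-based rank of `token` among candidates sorted by probability.

    Raises ValueError if `token` is not among the candidates.
    """
    ordered = sorted(probs, key=probs.get, reverse=True)
    return ordered.index(token) + 1


def bucket(rank: int) -> str:
    """Map a rank to the label of the first bucket whose limit it fits under."""
    for limit, label in BUCKETS:
        if rank <= limit:
            return label
    return "beyond-1000"


def overlay(tokens: List[str],
            predictions: List[Dict[str, float]]) -> List[Tuple[str, str]]:
    """Pair each observed token with its rank bucket under the model's prediction."""
    return [(tok, bucket(token_rank(p, tok)))
            for tok, p in zip(tokens, predictions)]


# Toy stand-in for a model: 20 candidate tokens with strictly decreasing scores.
vocab = [f"w{i}" for i in range(20)]
toy_probs = {w: 1.0 / (i + 1) for i, w in enumerate(vocab)}

# Two observed tokens, each scored against the same toy distribution.
labels = overlay(["w0", "w14"], [toy_probs, toy_probs])
print(labels)
```

Under this reading, text sampled from a model would tend to land mostly in the low-rank buckets, while human-written text would show more tokens from the higher-rank buckets, which is the contrast the figure caption describes.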
Jun-10-2019
- Genre:
- Research Report (0.84)
- Technology:
- Information Technology > Artificial Intelligence > Natural Language
- Chatbot (0.38)
- Generation (0.49)
- Large Language Model (0.38)