GPT Editors, Not Authors: The Stylistic Footprint of LLMs in Academic Preprints

DeHaan, Soren, Liu, Yuanze, Bollen, Johan, Blanco, Saúl A.

May-26-2025–arXiv.org Artificial Intelligence

The proliferation of Large Language Models (LLMs) in late 2022 has impacted academic writing, threatening credibility and causing institutional uncertainty. We seek to determine the degree to which LLMs are used to generate critical text, as opposed to being used for editing, such as checking for grammar errors or inappropriate phrasing. In our study, we analyze arXiv papers for stylistic segmentation, which we measure by varying a PELT threshold against a Bayesian classifier trained on GPT-regenerated text. We find that LLM-attributed language is not predictive of stylistic segmentation, suggesting that when authors use LLMs, they do so uniformly, which reduces the risk of hallucinations being introduced into academic preprints.
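
The segmentation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each sentence of a paper has already been given a score by some classifier (here a hypothetical probability that the sentence is LLM-attributed), and it assumes a squared-error (mean-shift) cost, which the abstract does not specify. The function name `pelt_mean_shift`, the `penalty` parameter standing in for the "PELT threshold", and the toy scores are all illustrative assumptions.

```python
from typing import List, Sequence


def pelt_mean_shift(scores: Sequence[float], penalty: float) -> List[int]:
    """PELT changepoint detection with a squared-error (mean-shift) cost.

    Returns the indices at which a new segment starts (index 0 excluded).
    The cost function and the meaning of `scores` are assumptions made
    for illustration; the source abstract does not specify them.
    """
    n = len(scores)
    # Prefix sums give the cost of any segment scores[a:b] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(scores):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def cost(a: int, b: int) -> float:
        # Sum of squared deviations from the segment mean over scores[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)   # best[t]: minimal penalized cost of scores[:t]
    best[0] = -penalty       # so the first segment is not charged a penalty
    last = [0] * (n + 1)     # last[t]: start of the final segment in that optimum
    candidates = [0]         # segment starts that can still be optimal

    for t in range(1, n + 1):
        values = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(values)
        # Pruning: drop any start s that can no longer beat the current optimum.
        candidates = [s for v, s in values if v - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to recover changepoints.
    changepoints = []
    t = n
    while last[t] > 0:
        changepoints.append(last[t])
        t = last[t]
    return sorted(changepoints)


if __name__ == "__main__":
    # Hypothetical per-sentence scores: a stylistic shift halfway through.
    toy_scores = [0.1] * 10 + [0.9] * 10
    for pen in (0.5, 100.0):
        print(pen, pelt_mean_shift(toy_scores, pen))
```

Varying `penalty` plays the role of varying the threshold: a low value lets the toy shift at index 10 register as a segment boundary, while a high value suppresses it. A document whose scores are uniform yields no boundaries at any penalty, which is the pattern the abstract associates with uniform LLM use.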

large language model, machine learning, natural language

Country:
- Asia (0.68)
- North America
  - United States (0.47)
  - Mexico (0.28)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Government > Regional Government (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Neural Networks > Deep Learning (0.49)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.35)
