Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard
Rosenfeld, Ariel, Lazebnik, Teddy
–arXiv.org Artificial Intelligence
Large Language Models (LLMs), such as GPT-3.5 [25], GPT-4 [1] and Bard [21], have revolutionized and popularized natural language processing and AI, demonstrating human-like and super-human performance in a wide range of text-based tasks [37]. While the layman may find the responses of LLMs hard to distinguish from human-generated ones [26, 3], a plethora of recent literature has shown that it is possible to successfully discern human-generated text from LLM-generated text using various computational techniques [30, 5, 15]. Among the developed techniques, the linguistic approach, which focuses on the structure, patterns, and nuances inherent in human language, stands out as a promising option that offers both high statistical performance [14] and theoretically grounded explanatory power [23], as opposed to alternative "black-box" machine-learning techniques [2, 29]. Indeed, recent literature has shown that human- and LLM-generated texts are, generally speaking, linguistically different across a wide variety of tasks and datasets, including news reporting [23], hotel reviewing [9], essay writing [14] and scientific communication [6], to name a few. Common to these and similar studies is the observation that LLM-generated texts tend to be extensive and comprehensive, highly organized, formally phrased, and logically structured, and to exhibit higher objectivity and a lower prevalence of bias and harmful content than human-generated texts [34]. Extensive research into human-generated texts has consistently demonstrated the inherent diversity in human writing styles, resulting in distinct linguistic patterns, structures, and nuances [27, 22, 28].
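To make the linguistic approach concrete, the following is a minimal illustrative sketch (not taken from the paper) of the kind of simple, interpretable stylometric features such an approach might compute over a text, here mean sentence length and type-token ratio; the function name and heuristics are assumptions for illustration only.

```python
import re

def linguistic_features(text):
    """Compute two simple, interpretable stylometric features:
    mean sentence length (in words) and type-token ratio
    (vocabulary diversity). Both are common building blocks in
    linguistically grounded text-attribution pipelines."""
    # Split into sentences on terminal punctuation (a rough heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Tokenize into lowercase alphabetic words.
    words = re.findall(r"[A-Za-z']+", text.lower())
    mean_sentence_len = len(words) / len(sentences) if sentences else 0.0
    type_token_ratio = len(set(words)) / len(words) if words else 0.0
    return {"mean_sentence_len": mean_sentence_len,
            "type_token_ratio": type_token_ratio}

sample = "The model writes long, well organized sentences. Humans vary more."
feats = linguistic_features(sample)
```

Features of this kind, unlike opaque neural classifiers, support the explanatory claims above: a difference in attribution can be traced back to a named, measurable property of the text.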
Feb-22-2024