Whose LLM is it Anyway? Linguistic Comparison and LLM Attribution for GPT-3.5, GPT-4 and Bard
Rosenfeld, Ariel, Lazebnik, Teddy
–arXiv.org Artificial Intelligence
Large Language Models (LLMs), such as GPT-3.5 [25], GPT-4 [1] and Bard [21], have revolutionized and popularized natural language processing and AI, demonstrating human-like and super-human performance in a wide range of text-based tasks [37]. While the layman may find the responses of LLMs hard to distinguish from human-generated ones [26, 3], a plethora of recent literature has shown that it is possible to successfully discern human-generated text from LLM-generated text using various computational techniques [30, 5, 15]. Among the developed techniques, the linguistic approach, which focuses on the structure, patterns, and nuances inherent in human language, stands out as a promising option that offers both high statistical performance [14] and theoretically grounded explanatory power [23], as opposed to alternative "black-box" machine-learning techniques [2, 29]. Indeed, recent literature has shown that human- and LLM-generated texts are, generally speaking, linguistically different across a wide variety of tasks and datasets, including news reporting [23], hotel reviewing [9], essay writing [14] and scientific communication [6], to name a few. Common to these and similar studies is the observation that LLM-generated texts tend to be extensive and comprehensive, highly organized, formally phrased, and logically structured, and to exhibit higher objectivity and a lower prevalence of bias and harmful content than human-generated texts [34]. Extensive research into human-generated texts has consistently demonstrated the inherent diversity in human writing styles, resulting in distinct linguistic patterns, structures, and nuances [27, 22, 28].
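To make the linguistic approach concrete, the following is a minimal illustrative sketch (not taken from the paper) of the kind of simple, interpretable stylometric features such an approach might compute over a text, here mean sentence length and type-token ratio; the function name and heuristics are assumptions for illustration only.

```python
import re

def linguistic_features(text):
    """Compute two simple, interpretable stylometric features:
    mean sentence length (in words) and type-token ratio
    (vocabulary diversity). Both are common building blocks in
    linguistically grounded text-attribution pipelines."""
    # Split into sentences on terminal punctuation (a rough heuristic).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Tokenize into lowercase alphabetic words.
    words = re.findall(r"[A-Za-z']+", text.lower())
    mean_sentence_len = len(words) / len(sentences) if sentences else 0.0
    type_token_ratio = len(set(words)) / len(words) if words else 0.0
    return {"mean_sentence_len": mean_sentence_len,
            "type_token_ratio": type_token_ratio}

sample = "The model writes long, well organized sentences. Humans vary more."
feats = linguistic_features(sample)
```

Features of this kind, unlike opaque neural classifiers, support the explanatory claims above: a difference in attribution can be traced back to a named, measurable property of the text.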
Feb-22-2024