Imperfect Language, Artificial Intelligence, and the Human Mind: An Interdisciplinary Approach to Linguistic Errors in Native Spanish Speakers
–arXiv.org Artificial Intelligence
Linguistic errors are not merely deviations from normative grammar; they offer a unique window into the cognitive architecture of language and expose the current limitations of artificial systems that seek to replicate them. This project proposes an interdisciplinary study of linguistic errors produced by native Spanish speakers, with the aim of analyzing how current large language models (LLMs) interpret, reproduce, or correct them. The research integrates three core perspectives: theoretical linguistics, to classify and understand the nature of the errors; neurolinguistics, to contextualize them within real-time language processing in the brain; and natural language processing (NLP), to evaluate how models interpret these errors. A purpose-built corpus of more than 500 authentic errors produced by native Spanish speakers will serve as the foundation for empirical analysis. These errors will be tested against AI models such as GPT or Gemini to assess their interpretative accuracy and their ability to generalize patterns of human linguistic behavior. The project contributes not only to the understanding of Spanish as a native language but also to the development of NLP systems that are more cognitively informed and capable of engaging with the imperfect, variable, and often ambiguous nature of real human language.
In recent years, the development of large language models (LLMs) such as GPT-4 and Gemini has revolutionized the field of natural language processing (NLP). These models, based on transformer architectures (Vaswani et al., 2017), have demonstrated unprecedented abilities to generate coherent text, perform automatic translation, and produce complex summaries. Their impressive performance has transformed many applications, from chatbots and virtual assistants to automated content creation and language learning tools.
However, despite these technological advances, LLMs still face significant challenges rooted in the inherently complex, ambiguous, and variable nature of real human language (Bender et al., 2021). In particular, the irregularities, ambiguities, and errors commonly found in informal and spontaneous contexts, such as everyday conversations or social media interactions, constitute a major obstacle to the optimal functioning of these systems. Human language is not a rigid or perfectly normative system; rather, it is a dynamic phenomenon that reflects complex cognitive processes and is characterized by variations and errors in production and comprehension (Levelt, 1989). Linguistic errors, even those produced by native speakers, should not be dismissed as mere random deviations but regarded as systematic manifestations that can provide valuable insight into the internal functioning of the linguistic system and its neurological foundations (Fromkin, 2013).
Nov-4-2025