Can structural correspondences ground real world representational content in Large Language Models?

Williams, Iwan


Historically, these systems included purely statistical models, but modern LLMs are deep artificial neural networks trained via machine learning. Once trained, an LLM may be deployed for various purposes, such as in chatbots and personal assistants, or for translation, sentiment analysis and document review. The indisputably impressive performance of LLMs on a wide variety of tasks raises pressing questions about their capacities, and about the mechanisms underlying those capacities. For instance, authors have grappled with the questions of whether LLMs understand language (Bender & Koller, 2020; Mitchell & Krakauer, 2022), whether they possess concepts (Butlin, 2023), and to what extent they possess a theory of mind (Kosinski, 2024; Ullman, 2023).

This paper focuses on the representational capacities of LLMs. Do LLMs rely on representations? If so, what do those representations represent? Much research in AI, such as studies using probing classifiers (Belinkov, 2022) and methods for "editing" models' representations (Hernandez et al., 2024; Meng et al., 2022), assumes that a representational lens is appropriate. But a key question is whether LLMs can represent real-world entities, or only "shallow" linguistic contents that don't reach into extra-linguistic reality (Butlin, 2021; Coelho Mollo & Millière, 2023; Yildirim & Paul, 2024).
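To make the probing methodology concrete, the following is a minimal illustrative sketch, not drawn from any of the works cited above. It assumes the Hugging Face transformers and scikit-learn libraries; GPT-2 is used purely as a stand-in model, and the toy sentences and labels are invented for illustration. The idea is that a linear "probe" is trained to predict some property of the input from the model's hidden activations; good accuracy on held-out examples is standardly taken as evidence that the property is encoded in those activations.

import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoTokenizer, AutoModel

# Hypothetical setup: GPT-2 as a stand-in for whatever LLM is probed.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

# Toy dataset (invented): sentences paired with a binary property label,
# here whether the sentence mentions a capital city.
sentences = [
    "Paris is the capital of France.",
    "The cat sat on the mat.",
    "Tokyo is the capital of Japan.",
    "She enjoys long walks in the park.",
]
labels = np.array([1, 0, 1, 0])

def last_token_activation(sentence, layer=-1):
    """Return the hidden state of the final token at the given layer."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.hidden_states[layer][0, -1].numpy()

# Feature matrix: one activation vector per sentence.
X = np.stack([last_token_activation(s) for s in sentences])

# Linear probe: if it predicts the property well (on held-out data, in a
# real study; here we only fit and score on the toy training set), the
# property is taken to be linearly decodable from the activations.
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("Probe accuracy:", probe.score(X, labels))

Note that a probe's success shows, at most, that some property is recoverable from the activations; whether that suffices for the model to represent real-world entities is precisely the question at issue in this paper.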