Human Language


Can animals read? Not in the human way.

Popular Science

A 2024 study found that cats learn to associate images with words faster than human babies do. "My cat always watches my phone as I text or read a book," someone wrote on Reddit. "Even right now she is on my shoulder, intently watching what I am typing on this post. Can she read or is she just interested in what I am doing?"


Anti-efficient encoding in emergent communication

Neural Information Processing Systems

Despite renewed interest in emergent language simulations with neural networks, little is known about the basic properties of the induced code, and how they compare to human language. One fundamental characteristic of the latter, known as Zipf's Law of Abbreviation (ZLA), is that more frequent words are efficiently associated with shorter strings. We study whether the same pattern emerges when two neural networks, a "speaker" and a "listener", are trained to play a signaling game. Surprisingly, we find that networks develop an anti-efficient encoding scheme, in which the most frequent inputs are associated with the longest messages, and messages in general are skewed towards the maximum length threshold. This anti-efficient code appears easier to discriminate for the listener, and, unlike in human communication, the speaker does not impose a contrasting least-effort pressure towards brevity. Indeed, when the cost function includes a penalty for longer messages, the resulting message distribution starts respecting ZLA. Our analysis stresses the importance of studying the basic features of emergent communication in a highly controlled setup, to ensure the latter does not stray too far from human language. Moreover, we present a concrete illustration of how different functional pressures can lead to successful communication codes that lack basic properties of human language, thus highlighting the role such pressures play in the latter.
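The ZLA property the abstract describes is easy to test empirically: over a code's message inventory, frequency and length should be negatively correlated, while an anti-efficient code shows a positive correlation. A minimal sketch (using only the standard library; the toy codes below are illustrative, not data from the paper):

```python
from collections import Counter

def freq_length_correlation(tokens):
    """Pearson correlation between each distinct token's frequency and
    its length. Negative values indicate a ZLA-like (efficient) code;
    positive values indicate an anti-efficient code."""
    counts = Counter(tokens)
    freqs = list(counts.values())
    lengths = [len(t) for t in counts]  # same iteration order as values()
    n = len(freqs)
    mf, ml = sum(freqs) / n, sum(lengths) / n
    cov = sum((f - mf) * (l - ml) for f, l in zip(freqs, lengths))
    sf = sum((f - mf) ** 2 for f in freqs) ** 0.5
    sl = sum((l - ml) ** 2 for l in lengths) ** 0.5
    return cov / (sf * sl)

# Toy "efficient" code: the most frequent message is the shortest.
efficient = ["a"] * 50 + ["bb"] * 20 + ["ccc"] * 5
# Toy "anti-efficient" code: the most frequent message is the longest.
anti = ["aaa"] * 50 + ["bb"] * 20 + ["c"] * 5

print(freq_length_correlation(efficient))  # negative
print(freq_length_correlation(anti))       # positive
```

The same statistic can be run on messages sampled from trained agents to diagnose which regime a given emergent code falls into.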


Trading off Utility, Informativeness, and Complexity in Emergent Communication

Neural Information Processing Systems

Emergent communication (EC) research often focuses on optimizing task-specific utility as a driver for communication. However, there is increasing evidence that human languages are shaped by task-general communicative constraints and evolve under pressure to optimize the Information Bottleneck (IB) tradeoff between the informativeness and complexity of the lexicon. Here, we integrate these two approaches by trading off utility, informativeness, and complexity in EC. To this end, we propose Vector-Quantized Variational Information Bottleneck (VQ-VIB), a method for training neural agents to encode inputs into discrete signals embedded in a continuous space. We evaluate our approach in multi-agent reinforcement learning settings and in color reference games and show that: (1) VQ-VIB agents can continuously adapt to changing communicative needs and, in the color domain, align with human languages; (2) the emergent VQ-VIB embedding spaces are semantically meaningful and perceptually grounded; and (3) encouraging informativeness leads to faster convergence rates and improved utility, both in VQ-VIB and in prior neural architectures for symbolic EC, with VQ-VIB achieving higher utility for any given complexity. This work offers a new framework for EC that is grounded in information-theoretic principles that are believed to characterize human language evolution and that may facilitate human-agent interaction.
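The core mechanism behind "discrete signals embedded in a continuous space" is vector quantization: a continuous encoder output is snapped to its nearest codebook vector, and the codebook index is the discrete message. A minimal sketch of that step (the names and shapes are illustrative; VQ-VIB additionally adds information-bottleneck terms to the training objective, which are omitted here):

```python
import numpy as np

def vector_quantize(z, codebook):
    """Snap each continuous encoding z (n, d) to its nearest codebook
    vector (k, d). Returns the discrete indices (the "messages") and
    the quantized continuous embeddings."""
    # Squared Euclidean distance from every encoding to every code vector.
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)       # one discrete symbol per input
    return idx, codebook[idx]     # symbol embedded in continuous space

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # 8 discrete signals in a 4-d space
z = rng.normal(size=(3, 4))          # 3 continuous encoder outputs
idx, zq = vector_quantize(z, codebook)
print(idx, zq.shape)
```

Because the quantized embedding lives in the same continuous space as the encoder output, the lexicon can drift smoothly as communicative needs change, which is what enables the continuous adaptation the abstract reports.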


For the First Time, AI Analyzes Language as Well as a Human Expert

WIRED

If language is what makes us human, what does it mean now that large language models have gained "metalinguistic" abilities? Among the myriad abilities that humans possess, which ones are uniquely human? Language has been a top candidate at least since Aristotle, who wrote that humanity was "the animal that has language." Even as large language models such as ChatGPT superficially replicate ordinary speech, researchers want to know if there are specific aspects of human language that simply have no parallels in the communication systems of other animals or artificially intelligent devices. In particular, researchers have been exploring the extent to which language models can reason about language itself.


Language models as tools for investigating the distinction between possible and impossible natural languages

Kallini, Julie, Potts, Christopher

arXiv.org Artificial Intelligence

December 5, 2025

We argue that language models (LMs) have strong potential as investigative tools for probing the distinction between possible and impossible natural languages, and thus for uncovering the inductive biases that support human language learning. We outline a phased research program in which LM architectures are iteratively refined to better discriminate between possible and impossible languages, supporting linking hypotheses to human cognition. Which conceivable linguistic systems are possible for humans to learn and use as natural languages? A complete answer to this question would yield profound insights into the human capacity for language. However, our tools for addressing the question are very limited.
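In related work along these lines, "impossible" counterfactual languages are typically constructed by applying systematic, content-independent perturbations to natural text, such as fully reversing token order or applying a fixed shuffle. A minimal sketch of two such transforms (the function names are illustrative, not from the paper):

```python
import random

def reverse_language(tokens):
    """'Impossible' counterfactual: every sentence fully reversed."""
    return tokens[::-1]

def deterministic_shuffle(tokens, seed=42):
    """'Impossible' counterfactual: a fixed, content-independent
    permutation applied to every sentence of a given length."""
    order = list(range(len(tokens)))
    random.Random(seed).shuffle(order)
    return [tokens[i] for i in order]

sentence = "the cat sat on the mat".split()
print(reverse_language(sentence))
print(deterministic_shuffle(sentence))
```

Training otherwise identical LMs on natural text versus such perturbed corpora, and comparing learning curves, is one way to probe whether an architecture carries inductive biases that favor humanly possible languages.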


Identifying Quantum Structure in AI Language: Evidence for Evolutionary Convergence of Human and Artificial Cognition

Aerts, Diederik, Arguëlles, Jonito Aerts, Beltran, Lester, Geriente, Suzette, Leporini, Roberto, de Bianchi, Massimiliano Sassoli, Sozzo, Sandro

arXiv.org Artificial Intelligence

We present the results of cognitive tests on conceptual combinations, performed using specific Large Language Models (LLMs) as test subjects. In the first test, performed with ChatGPT and Gemini, we show that Bell's inequalities are significantly violated, which indicates the presence of 'quantum entanglement' in the tested concepts. In the second test, also performed using ChatGPT and Gemini, we instead identify the presence of 'Bose-Einstein statistics', rather than the intuitively expected 'Maxwell-Boltzmann statistics', in the distribution of the words contained in large-size texts. Interestingly, these findings mirror the results previously obtained in both cognitive tests with human participants and information retrieval tests on large corpora. Taken together, they point to the 'systematic emergence of quantum structures in conceptual-linguistic domains', regardless of whether the cognitive agent is human or artificial. Although LLMs are classified as neural networks for historical reasons, we believe that a more essential form of knowledge organization takes place in the distributive semantic structure of vector spaces built on top of the neural network. It is this meaning-bearing structure that lends itself to a phenomenon of evolutionary convergence between human cognition and language, slowly established through biological evolution, and LLM cognition and language, emerging much more rapidly as a result of self-learning and training. We analyze various aspects and examples that contain evidence supporting the above hypothesis. We also advance a unifying framework that explains the pervasive quantum organization of meaning that we identify.
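The Bell-inequality test mentioned above is usually run in its CHSH form: from four pairwise correlations one computes a statistic that any classical (local hidden-variable) model keeps at or below 2, while entangled systems can reach 2√2. A minimal sketch of the arithmetic (the correlation values below are the textbook quantum optimum, not data from this paper):

```python
import math

def chsh(E_ab, E_ab2, E_a2b, E_a2b2):
    """CHSH statistic S = |E(a,b) + E(a,b') + E(a',b) - E(a',b')|.
    Classical models satisfy S <= 2; quantum entanglement allows up to
    2*sqrt(2) ~ 2.828 (Tsirelson's bound)."""
    return abs(E_ab + E_ab2 + E_a2b - E_a2b2)

# Correlations at the quantum optimum: E = cos(pi/4) = sqrt(2)/2.
E = math.sqrt(2) / 2
print(chsh(E, E, E, -E))  # 2*sqrt(2) > 2: the inequality is violated
```

A reported violation means the measured correlations cannot be reproduced by assigning the concepts pre-existing joint values, which is the sense in which "quantum entanglement" is claimed for the tested conceptual combinations.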


What enables human language? A biocultural framework

Science

Case study 1 considers vocal production learning, an organism's capacity to enlarge and modify its repertoire of vocalizations based on auditory experience. This ability is crucial for learning spoken language and limited in nonhuman primates but has emerged in other branches of the evolutionary tree, including subsets of birds, bats, elephants, cetaceans, and pinnipeds. Bringing together data from molecular investigations of speech and language disorders, genetic manipulations in animal models, and studies of ancient DNA, this case study demonstrates how ancient genetic and neural infrastructures may have been modified and recombined to enable distinctive human capacities. Case study 2 examines the emergence of linguistic structure, a defining property of human language, using data from real-world cases of emergence (e.g., homesign and emerging sign languages); experiments recreating cultural evolution in the lab; and comparative studies of nonhuman animals, including songbirds and primates. This case study highlights the importance of transmission and interaction, suggesting that emergence of structure involves a combination of biological, cognitive, and cultural conditions: Although some (or all) traits are shared with other species, their combination may be specific to humans.