Goto

Collaborating Authors

 rosch


From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning

Shani, Chen, Soffer, Liron, Jurafsky, Dan, LeCun, Yann, Shwartz-Ziv, Ravid

arXiv.org Artificial Intelligence

Humans organize knowledge into compact conceptual categories that balance compression with semantic richness. Large Language Models (LLMs) exhibit impressive linguistic abilities, but whether they navigate this same compression-meaning trade-off remains unclear. We apply an Information Bottleneck framework to compare human conceptual structure with embeddings from 40+ LLMs using classic categorization benchmarks. We find that LLMs broadly align with human category boundaries, yet fall short on fine-grained semantic distinctions. Unlike humans, who maintain ``inefficient'' representations that preserve contextual nuance, LLMs aggressively compress, achieving more optimal information-theoretic compression at the cost of semantic richness. Surprisingly, encoder models outperform much larger decoder models in human alignment, suggesting that understanding and generation rely on distinct representational mechanisms. Training-dynamics analysis reveals a two-phase trajectory: rapid initial concept formation followed by architectural reorganization, during which semantic processing migrates from deep to mid-network layers as the model discovers increasingly efficient, sparser encodings. These divergent strategies, where LLMs optimize for compression and humans for adaptive utility, reveal fundamental differences between artificial and natural intelligence. This highlights the need for models that preserve the conceptual ``inefficiencies'' essential for human-like understanding.


Towards Ranking Schemas by Focus

Fumagalli, Mattia, Shi, Daqian, Giunchiglia, Fausto

arXiv.org Artificial Intelligence

The main goal of this paper is to evaluate knowledge base schemas, modeled as a set of entity types, each such type being associated with a set of properties, according to their focus. We intuitively model the notion of focus as ''the state or quality of being relevant in storing and retrieving information''. This definition of focus is adapted from the notion of ''categorization purpose'', as first defined in cognitive psychology, thus giving us a high level of understandability on the side of users. In turn, this notion is formalized based on a set of knowledge metrics that, for any given focus, rank knowledge base schemas according to their quality. We apply the proposed methodology to more than 200 state-of-the-art knowledge base schemas. The experimental results show the utility of our approach


Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks

Renner, Joseph, Denis, Pascal, Gilleron, Rémi, Brunellière, Angèle

arXiv.org Artificial Intelligence

Recent work on predicting category structure with distributional models, using either static word embeddings (Heyman and Heyman, 2019) or contextualized language models (CLMs) (Misra et al., 2021), report low correlations with human ratings, thus calling into question their plausibility as models of human semantic memory. In this work, we revisit this question testing a wider array of methods for probing CLMs for predicting typicality scores. Our experiments, using BERT (Devlin et al., 2018), show the importance of using the right type of CLM probes, as our best BERT-based typicality prediction methods substantially improve over previous works. Second, our results highlight the importance of polysemy in this task: our best results are obtained when using a disambiguation mechanism. Finally, additional experiments reveal that Information Contentbased WordNet (Miller, 1995), also endowed with disambiguation, match the performance of the best BERT-based method, and in fact capture complementary information, which can be combined with BERT to achieve enhanced typicality predictions.


Is it a Fruit, an Apple or a Granny Smith? Predicting the Basic Level in a Concept Hierarchy

Hollink, Laura, Bilgin, Aysenur, van Ossenbruggen, Jacco

arXiv.org Artificial Intelligence

The "basic level", according to experiments in cognitive psychology, is the level of abstraction in a hierarchy of concepts at which humans perform tasks quicker and with greater accuracy than at other levels. We argue that applications that use concept hierarchies - such as knowledge graphs, ontologies or taxonomies - could significantly improve their user interfaces if they `knew' which concepts are the basic level concepts. This paper examines to what extent the basic level can be learned from data. We test the utility of three types of concept features, that were inspired by the basic level theory: lexical features, structural features and frequency features. We evaluate our approach on WordNet, and create a training set of manually labelled examples that includes concepts from different domains. Our findings include that the basic level concepts can be accurately identified within one domain. Concepts that are difficult to label for humans are also harder to classify automatically. Our experiments provide insight into how classification performance across domains could be improved, which is necessary for identification of basic level concepts on a larger scale.


Is anyone in AI/Machine Learning community working on realizing Daniel Dennett's Stances artificially? • /r/artificial

#artificialintelligence

The core idea is that, when understanding, explaining and/or predicting the behavior of an object, we can choose to view it at varying levels of abstraction. The more concrete the level, the more accurate in principle our predictions are; the more abstract, the greater the computational power we gain by zooming out and skipping over the irrelevant details. Dennett defines three levels of abstraction, attained by adopting one of three entirely different "stances", or intellectual strategies: the physical stance; the design stance; and the intentional stance: The most concrete is the physical stance, the domain of physics and chemistry, which makes predictions from knowledge of the physical constitution of the system and the physical laws that govern its operation; and thus, given a particular set of physical laws and initial conditions, and a particular configuration, a specific future state is predicted (this could also be called the "structure stance").[15] At this level, we are concerned with such things as mass, energy, velocity, and chemical composition. When we predict where a ball is going to land based on its current trajectory, we are taking the physical stance.