Kumar, Sreejan
Centaur: a foundation model of human cognition
Binz, Marcel, Akata, Elif, Bethge, Matthias, Brändle, Franziska, Callaway, Fred, Coda-Forno, Julian, Dayan, Peter, Demircan, Can, Eckstein, Maria K., Éltető, Noémi, Griffiths, Thomas L., Haridi, Susanne, Jagadish, Akshay K., Ji-An, Li, Kipnis, Alexander, Kumar, Sreejan, Ludwig, Tobias, Mathony, Marvin, Mattar, Marcelo, Modirshanechi, Alireza, Nath, Surabhi S., Peterson, Joshua C., Rmus, Milena, Russek, Evan M., Saanum, Tankred, Scharfenberg, Natalia, Schubert, Johannes A., Buschoff, Luca M. Schulze, Singhi, Nishad, Sui, Xin, Thalmann, Mirko, Theis, Fabian, Truong, Vuong, Udandarao, Vishaal, Voudouris, Konstantinos, Wilson, Robert, Witte, Kristin, Wu, Shuchen, Wulff, Dirk, Xiong, Huadong, Schulz, Eric
Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model's internal representations become more aligned with human neural activity after finetuning. Taken together, Centaur is the first real candidate for a unified model of human cognition. We anticipate that it will have a disruptive impact on the cognitive sciences, challenging the existing paradigm for developing computational models.
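As an illustration of the evaluation setup described above, here is a minimal sketch of scoring one held-out human choice by its negative log-likelihood under a finetuned causal language model. The checkpoint path, prompt wording, and trial format are placeholder assumptions, not the released model or the Psych-101 transcription scheme.

    # Rough sketch (not the released pipeline): score one human choice by its
    # negative log-likelihood under a finetuned causal LM.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "path/to/centaur-style-checkpoint"  # hypothetical checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id).eval()

    prompt = "You are choosing between two slot machines, F and J. You choose machine"
    choice = " J"  # the held-out participant's actual response

    full = tokenizer(prompt + choice, return_tensors="pt")
    n_prompt = tokenizer(prompt, return_tensors="pt")["input_ids"].shape[1]

    with torch.no_grad():
        logits = model(**full).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full["input_ids"][0, 1:]
    token_nll = -log_probs[torch.arange(len(targets)), targets]
    # Sum NLL over the choice tokens only (assumes the prompt's tokens
    # form a prefix of the full sequence's tokens).
    print("per-trial NLL:", token_nll[n_prompt - 1:].sum().item())

Averaging this quantity over trials and participants gives the kind of held-out predictive comparison against cognitive models that the abstract describes.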
Using Contrastive Learning with Generative Similarity to Learn Spaces that Capture Human Inductive Biases
Marjieh, Raja, Kumar, Sreejan, Campbell, Declan, Zhang, Liyi, Bencomo, Gianluca, Snell, Jake, Griffiths, Thomas L.
Humans rely on strong inductive biases to learn from few examples and abstract useful information from sensory data. Instilling such biases in machine learning models has been shown to improve their performance on various benchmarks including few-shot learning, robustness, and alignment. However, finding effective training procedures to achieve that goal can be challenging as psychologically rich training data such as human similarity judgments are expensive to scale, and Bayesian models of human inductive biases are often intractable for complex, realistic domains. Here, we address this challenge by introducing a Bayesian notion of generative similarity whereby two datapoints are considered similar if they are likely to have been sampled from the same distribution. This measure can be applied to complex generative processes, including probabilistic programs. We show that generative similarity can be used to define a contrastive learning objective even when its exact form is intractable, enabling learning of spatial embeddings that express specific inductive biases. We demonstrate the utility of our approach by showing how it can be used to capture human inductive biases for geometric shapes, and to better distinguish different abstract drawing styles that are parameterized by probabilistic programs.
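To make the objective concrete, here is a minimal sketch of a contrastive loss in which a positive pair is two independent samples from the same generative program. The encoder, the generator callables, and the InfoNCE form are illustrative assumptions rather than the paper's exact implementation.

    # Sketch: positives are two independent samples from the same program.
    import torch
    import torch.nn.functional as F

    def sample_pairs(generators, n_pairs):
        """Pick a program per pair; draw two independent samples from it."""
        xs, ys = [], []
        for _ in range(n_pairs):
            g = generators[torch.randint(len(generators), (1,)).item()]
            xs.append(g())
            ys.append(g())
        return torch.stack(xs), torch.stack(ys)

    def generative_similarity_loss(encoder, xs, ys, temperature=0.5):
        """InfoNCE: same-program samples attract, different-program repel."""
        zx = F.normalize(encoder(xs), dim=-1)
        zy = F.normalize(encoder(ys), dim=-1)
        logits = zx @ zy.T / temperature   # pairwise similarities, [n, n]
        targets = torch.arange(len(xs))    # matched pairs on the diagonal
        return F.cross_entropy(logits, targets)

Because positives are defined by shared generative origin rather than by data augmentation, the learned embedding space inherits the structure of the generative processes, which is the sense in which it captures the target inductive biases.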
Human-Like Geometric Abstraction in Large Pre-trained Neural Networks
Campbell, Declan, Kumar, Sreejan, Giallanza, Tyler, Griffiths, Thomas L., Cohen, Jonathan D.
…that can capture regularities in the external world. By forming abstractions that can generalize to future experience, humans are able to exhibit efficient learning and strong generalization across domains (Lake, Salakhutdinov, & Tenenbaum, 2015; Hull, 1920). One domain in which this has been observed by cognitive scientists is geometric reasoning (Dehaene, Al Roumi, Lakretz, Planton, & Sablé-Meyer, 2022), where people consistently extract abstract concepts, such as parallelism, symmetry, and convexity, that generalize across many visual instances. Specifically, we apply neural network models to behavioral tasks from recent empirical work (Sablé-Meyer et al., 2021, 2022; Hsu, Wu, & Goodman, 2022) that catalogue three effects indicative of abstraction in human geometric reasoning. First, humans are sensitive to geometric complexity, such that they are slower to recall complex images as compared to simpler ones (Sablé-Meyer et al., 2022). Second, humans are sensitive to geometric regularity (based on features such as right angles, parallel sides, and symmetry), such that they are able to classify regular…
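One way to probe such effects in a pretrained network, sketched below purely as an illustration, is to predict the intruder in an oddball trial as the image whose features lie farthest from the others. The ResNet-50 backbone is a stand-in feature extractor and the stimulus images are placeholders; this is not the paper's exact protocol.

    # Illustrative oddball probe: given six shape images (five variants of one
    # shape plus one intruder), predict the intruder from feature distances.
    import torch
    from torchvision.models import resnet50, ResNet50_Weights

    weights = ResNet50_Weights.DEFAULT
    backbone = torch.nn.Sequential(*list(resnet50(weights=weights).children())[:-1]).eval()
    preprocess = weights.transforms()

    def predict_oddball(images):
        batch = torch.stack([preprocess(im) for im in images])
        with torch.no_grad():
            feats = backbone(batch).flatten(1)   # [6, 2048] features
        idx = torch.arange(len(feats))
        dists = torch.stack([torch.dist(feats[i], feats[idx != i].mean(0))
                             for i in range(len(feats))])
        return int(dists.argmax())               # predicted intruder index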
Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction
Kumar, Sreejan, Marjieh, Raja, Zhang, Byron, Campbell, Declan, Hu, Michael Y., Bhatt, Umang, Lake, Brenden, Griffiths, Thomas L.
Humans extract useful abstractions of the world from noisy sensory data. Serial reproduction allows us to study how people construe the world through a paradigm similar to the game of telephone, where one person observes a stimulus and reproduces it for the next to form a chain of reproductions. Past serial reproduction experiments typically employ a single sensory modality, but humans often communicate abstractions of the world to each other through language. To investigate the effect of language on the formation of abstractions, we implement a novel multimodal serial reproduction framework by asking people who receive a visual stimulus to reproduce it in a linguistic format, and vice versa. We ran unimodal and multimodal chains with both humans and GPT-4 and found that adding language as a modality has a larger effect on human reproductions than on GPT-4's. This suggests that human visual and linguistic representations are more dissociable than those of GPT-4.
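The chain structure itself is simple to state in code. Below is a schematic with toy reproduce functions standing in for a human participant or GPT-4; the stimulus encoding and the lossy "memory" are invented for illustration only.

    # Schematic multimodal serial reproduction chain: reproductions alternate
    # between a "visual" stimulus and a linguistic description of it.
    def describe(image_tokens):
        """Visual -> linguistic: name what was (imperfectly) remembered."""
        return " ".join(sorted(image_tokens)[:3])   # lossy, like human memory

    def draw(description):
        """Linguistic -> visual: render the named contents back into a scene."""
        return set(description.split())

    def multimodal_chain(seed_image, n_steps):
        """Run image -> text -> image -> ... and return every reproduction."""
        chain, stimulus = [seed_image], seed_image
        for step in range(n_steps):
            stimulus = describe(stimulus) if step % 2 == 0 else draw(stimulus)
            chain.append(stimulus)
        return chain

    print(multimodal_chain({"dog", "ball", "tree", "cloud"}, 4))

A unimodal chain is the special case where the same reproduce function is applied at every step; comparing drift across the two chain types is what isolates the effect of language.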
Learning to Abstract Visuomotor Mappings using Meta-Reinforcement Learning
Velazquez-Vargas, Carlos A., Christian, Isaac Ray, Taylor, Jordan A., Kumar, Sreejan
We investigated the human capacity to acquire multiple visuomotor mappings for de novo skills. Using a grid navigation paradigm, we tested whether contextual cues, implemented as different "grid worlds", allow participants to learn two distinct key mappings more efficiently. Our results indicate that task performance is significantly better when contextual information is provided. The same held true for meta-reinforcement learning agents that differed in whether or not they received contextual information while performing the task. We evaluated the agents' accuracy in predicting human performance on the task and analyzed their internal representations. The results indicate that contextual cues allow the formation of separate representations in space and time when different visuomotor mappings are used, whereas their absence favors sharing a single representation. While both strategies can support learning of multiple visuomotor mappings, we showed that contextual cues provide a computational advantage in terms of how many mappings can be learned.
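A minimal sketch of the agent contrast follows, assuming an LSTM policy whose input may include a one-hot context cue identifying the current "grid world". The architecture and sizes are illustrative, not the paper's exact model.

    # With use_context=False the agent must infer the active mapping from its
    # interaction history alone; with True, the cue disambiguates it directly.
    import torch
    import torch.nn as nn

    class MetaRLAgent(nn.Module):
        def __init__(self, obs_dim, n_contexts, n_actions, hidden=128, use_context=True):
            super().__init__()
            self.use_context = use_context
            in_dim = obs_dim + (n_contexts if use_context else 0)
            self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
            self.policy = nn.Linear(hidden, n_actions)

        def forward(self, obs, context_onehot, state=None):
            x = torch.cat([obs, context_onehot], dim=-1) if self.use_context else obs
            h, state = self.rnn(x, state)
            return self.policy(h), state

Comparing the recurrent states of the two variants across mappings is what reveals whether representations separate (with cues) or are shared (without).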
Disentangling Abstraction from Statistical Pattern Matching in Human and Machine Learning
Kumar, Sreejan, Dasgupta, Ishita, Daw, Nathaniel D., Cohen, Jonathan D., Griffiths, Thomas L.
The ability to acquire abstract knowledge is a hallmark of human intelligence and is believed by many to be one of the core differences between humans and neural network models. Agents can be endowed with an inductive bias towards abstraction through meta-learning, where they are trained on a distribution of tasks that share some abstract structure that can be learned and applied. However, because neural networks are hard to interpret, it can be difficult to tell whether agents have learned the underlying abstraction, or alternatively statistical patterns that are characteristic of that abstraction. In this work, we compare the performance of humans and agents in a meta-reinforcement learning paradigm in which tasks are generated from abstract rules. We define a novel methodology for building "task metamers" that closely match the statistics of the abstract tasks but use a different underlying generative process, and evaluate performance on both abstract and metamer tasks. We find that humans perform better at abstract tasks than metamer tasks, whereas common neural network architectures typically perform worse on the abstract tasks than on the matched metamers. This work provides a foundation for characterizing differences between human and machine learning that can be used in future work toward developing machines with more human-like behavior.
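A toy version of the task/metamer contrast is sketched below, assuming boards where the abstract rule is "rewards fill one row" and metamers match only per-tile marginal frequencies. The paper's metamer construction is more careful, so treat this purely as illustration.

    # Abstract boards follow an explicit rule; metamer boards come from a
    # different generative process matched on low-order statistics.
    import numpy as np

    rng = np.random.default_rng(0)
    SIZE = 4

    def abstract_board():
        """Rule-based: rewards fill exactly one row."""
        board = np.zeros((SIZE, SIZE), dtype=int)
        board[rng.integers(SIZE)] = 1
        return board

    def metamer_board(marginals):
        """Statistics-matched: independent tiles with the same marginal rates."""
        return (rng.random((SIZE, SIZE)) < marginals).astype(int)

    # Estimate per-tile marginals from the abstract distribution, then sample.
    marginals = np.mean([abstract_board() for _ in range(10_000)], axis=0)
    print(metamer_board(marginals))

A learner that relies on the rule should find abstract boards easier; a learner that relies on surface statistics should do as well or better on the metamers, which is the diagnostic the abstract describes.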
Using Natural Language and Program Abstractions to Instill Human Inductive Biases in Machines
Kumar, Sreejan, Correa, Carlos G., Dasgupta, Ishita, Marjieh, Raja, Hu, Michael Y., Hawkins, Robert D., Daw, Nathaniel D., Cohen, Jonathan D., Narasimhan, Karthik, Griffiths, Thomas L.
Strong inductive biases give humans the ability to quickly learn to perform a variety of tasks. Although meta-learning is a method to endow neural networks with useful inductive biases, agents trained by meta-learning may sometimes acquire very different strategies from humans. We show that co-training these agents on predicting representations from natural language task descriptions and from programs induced to generate such tasks guides them toward more human-like inductive biases. Human-generated language descriptions and program induction models that add new learned primitives both contain abstract concepts that can compress description length. Co-training on these representations results in more human-like behavior in downstream meta-reinforcement learning agents than less abstract controls (synthetic language descriptions, program induction without learned primitives), suggesting that the abstraction supported by these representations is key.
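One way to realize the co-training idea is sketched below, assuming a simple regression head from the agent's recurrent state onto a precomputed embedding of the task description; the head, dimensions, and MSE form are illustrative, not the paper's exact setup.

    # Alongside the RL loss, the agent's recurrent state is regressed onto an
    # embedding of the task's natural-language description (or induced program).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LanguageCoTrainingHead(nn.Module):
        def __init__(self, hidden_dim, lang_dim):
            super().__init__()
            self.project = nn.Linear(hidden_dim, lang_dim)

        def auxiliary_loss(self, agent_hidden, description_embedding):
            # Pull the agent's task representation toward the language one.
            return F.mse_loss(self.project(agent_hidden), description_embedding)

    # Combined objective (schematic):
    #   total_loss = rl_loss + aux_weight * head.auxiliary_loss(h_t, lang_emb)

The controls in the abstract (synthetic descriptions, primitive-free program induction) plug into the same head; only the target embeddings change, which isolates abstraction as the active ingredient.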
Meta-Learning of Compositional Task Distributions in Humans and Machines
Kumar, Sreejan, Dasgupta, Ishita, Cohen, Jonathan D., Daw, Nathaniel D., Griffiths, Thomas L.
Modern machine learning systems struggle with sample efficiency and are usually trained with enormous amounts of data for each task. This is in sharp contrast with humans, who often learn with very little data. In recent years, meta-learning, in which one trains on a family of tasks (i.e. a task distribution), has emerged as an approach to improving the sample complexity of machine learning systems and to closing the gap between human and machine learning. However, in this paper, we argue that current meta-learning approaches still differ significantly from human learning. We argue that humans learn over tasks by constructing compositional generative models and using these to generalize, whereas current meta-learning methods are biased toward the use of simpler statistical patterns. To highlight this difference, we construct a new meta-reinforcement learning task with a compositional task distribution. We also introduce a novel approach to constructing a "null task distribution" with the same statistical complexity as the compositional distribution but without explicit compositionality. We train a standard meta-learning agent, a recurrent network trained with model-free reinforcement learning, and compare it with human performance across the two task distributions. We find that humans do better in the compositional task distribution whereas the agent does better in the non-compositional null task distribution, despite comparable statistical complexity. This work highlights a particular difference between human learning and current meta-learning models, introduces a task that displays this difference, and paves the way for future work on human-like meta-learning.
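A toy sketch of a compositional board generator follows, assuming the primitives are full rows and columns combined by union. Constructing the matched null distribution is the delicate part of the paper and is only noted in the comments; everything here is illustrative.

    # Toy compositional task distribution: each board is the union of two
    # primitives (a full row and a full column). A matched null distribution
    # would reproduce the boards' low-order statistics without this
    # compositional generative process; this sketch omits that construction.
    import numpy as np

    rng = np.random.default_rng(1)
    SIZE = 7

    def compositional_board():
        board = np.zeros((SIZE, SIZE), dtype=int)
        board[rng.integers(SIZE), :] = 1   # primitive 1: a row
        board[:, rng.integers(SIZE)] = 1   # primitive 2: a column
        return board                       # composition: union of the two

    print(compositional_board())

A learner that recovers the row-plus-column generative model can predict unseen tiles after a few observations, whereas a learner tracking only summary statistics cannot, which is the behavioral signature the experiments measure.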