Wilson, Robert
Centaur: a foundation model of human cognition
Binz, Marcel, Akata, Elif, Bethge, Matthias, Brändle, Franziska, Callaway, Fred, Coda-Forno, Julian, Dayan, Peter, Demircan, Can, Eckstein, Maria K., Éltető, Noémi, Griffiths, Thomas L., Haridi, Susanne, Jagadish, Akshay K., Ji-An, Li, Kipnis, Alexander, Kumar, Sreejan, Ludwig, Tobias, Mathony, Marvin, Mattar, Marcelo, Modirshanechi, Alireza, Nath, Surabhi S., Peterson, Joshua C., Rmus, Milena, Russek, Evan M., Saanum, Tankred, Scharfenberg, Natalia, Schubert, Johannes A., Buschoff, Luca M. Schulze, Singhi, Nishad, Sui, Xin, Thalmann, Mirko, Theis, Fabian, Truong, Vuong, Udandarao, Vishaal, Voudouris, Konstantinos, Wilson, Robert, Witte, Kristin, Wu, Shuchen, Wulff, Dirk, Xiong, Huadong, Schulz, Eric
Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model's internal representations become more aligned with human neural activity after finetuning. Taken together, these results establish Centaur as the first real candidate for a unified model of human cognition. We anticipate that it will have a disruptive impact on the cognitive sciences, challenging the existing paradigm for developing computational models.
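As a concrete illustration of the finetuning recipe sketched above, the following is a minimal, hypothetical example: a causal language model is adapted with low-rank adapters on natural-language transcripts of experiments, with the loss computed only on the tokens encoding the participant's responses. The base model identifier, prompt format, and hyperparameters are placeholders, not the actual Centaur configuration.

# Hypothetical sketch: finetune a causal language model so that it predicts
# human choices from natural-language experiment transcripts.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-3.1-8B"  # placeholder, not the model used for Centaur
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Low-rank adapters keep finetuning tractable at this scale.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"]))

# One training example: an experiment transcript followed by the participant's choice.
prompt = "You are choosing between two slot machines. You press <<"  # illustrative
response = "B>>"
inputs = tokenizer(prompt + response, return_tensors="pt")
labels = inputs["input_ids"].clone()
# Mask the transcript tokens so the loss covers only the human response.
labels[:, : len(tokenizer(prompt)["input_ids"])] = -100
loss = model(**inputs, labels=labels).loss
loss.backward()  # in practice, wrapped in a standard training loop over Psych-101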
The Neural Costs of Optimal Control
Gershman, Samuel, Wilson, Robert
Optimal control entails combining probabilities and utilities. However, for most practical problems, probability densities can be represented only approximately. Choosing an approximation requires balancing the benefits of an accurate approximation against the costs of computing it. We propose a variational framework for achieving this balance and apply it to the problem of how a neural population code should optimally represent a distribution under resource constraints. The essence of our analysis is the conjecture that population codes are organized to maximize a lower bound on the log expected utility. This theory can account for a plethora of experimental data, including the reward modulation of sensory receptive fields, GABAergic effects on saccadic movements, and risk aversion in decisions under uncertainty.
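For orientation, the lower bound referred to above can be sketched with the standard variational construction based on Jensen's inequality; the notation below is generic, assumes a positive utility function U, and may differ from the paper's exact formulation. For any distribution q(s),

\log \mathbb{E}_{p(s \mid x)}\!\left[ U(a, s) \right]
  = \log \int p(s \mid x)\, U(a, s)\, ds
  \;\geq\; \int q(s) \log \frac{p(s \mid x)\, U(a, s)}{q(s)}\, ds
  = \mathbb{E}_{q(s)}\!\left[ \log U(a, s) \right] - \mathrm{KL}\!\left( q(s) \,\|\, p(s \mid x) \right).

Maximizing the right-hand side over a resource-constrained family of approximations q trades the accuracy of the approximation (the KL term) against expected log utility, which is the balance the abstract describes.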
A Neural Implementation of the Kalman Filter
Wilson, Robert, Finkel, Leif
There is a growing body of experimental evidence to suggest that the brain is capable of approximating optimal Bayesian inference in the face of noisy input stimuli. Despite this progress, the neural underpinnings of this computation are still poorly understood. In this paper we focus on the problem of Bayesian filtering of stochastic time series. In particular we introduce a novel neural network, derived from a line attractor architecture, whose dynamics map directly onto those of the Kalman Filter in the limit where the prediction error is small. When the prediction error is large we show that the network responds robustly to change-points in a way that is qualitatively compatible with the optimal Bayesian model. The model suggests ways in which probability distributions are encoded in the brain and makes a number of testable experimental predictions.
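For reference, below is a minimal sketch of the scalar Kalman filter (a random-walk state observed in Gaussian noise) whose update the line-attractor network is said to approximate when prediction errors are small; the variable names and toy simulation are illustrative and not taken from the paper.

# Minimal scalar Kalman filter for a random-walk state observed in noise.
import numpy as np

def kalman_step(x_hat, p, y, q, r):
    """One predict/update cycle.
    x_hat: state estimate, p: estimate variance,
    y: new observation, q: process variance, r: observation noise variance."""
    p_pred = p + q                       # predict: state drifts as a random walk
    k = p_pred / (p_pred + r)            # Kalman gain
    x_hat_new = x_hat + k * (y - x_hat)  # correct by the weighted prediction error
    p_new = (1.0 - k) * p_pred
    return x_hat_new, p_new

# Track a slowly drifting latent signal from noisy samples.
rng = np.random.default_rng(0)
latent = np.cumsum(rng.normal(0.0, 0.1, size=200))
observations = latent + rng.normal(0.0, 0.5, size=200)
x_hat, p, estimates = 0.0, 1.0, []
for y in observations:
    x_hat, p = kalman_step(x_hat, p, y, q=0.01, r=0.25)
    estimates.append(x_hat)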