Kriegeskorte, Nikolaus
How does the primate brain combine generative and discriminative computations in vision?
Peters, Benjamin, DiCarlo, James J., Gureckis, Todd, Haefner, Ralf, Isik, Leyla, Tenenbaum, Joshua, Konkle, Talia, Naselaris, Thomas, Stachenfeld, Kimberly, Tavares, Zenna, Tsao, Doris, Yildirim, Ilker, Kriegeskorte, Nikolaus
Vision is widely understood as an inference problem. However, two contrasting conceptions of the inference process have each been influential in research on biological vision as well as the engineering of machine vision. The first emphasizes bottom-up signal flow, describing vision as a largely feedforward, discriminative inference process that filters and transforms the visual information to remove irrelevant variation and represent behaviorally relevant information in a format suitable for downstream functions of cognition and behavioral control. In this conception, vision is driven by the sensory data, and perception is direct because the processing proceeds from the data to the latent variables of interest. The notion of "inference" in this conception is that of the engineering literature on neural networks, where feedforward convolutional neural networks processing images are said to perform inference. The alternative conception is that of vision as an inference process in Helmholtz's sense, where the sensory evidence is evaluated in the context of a generative model of the causal processes giving rise to it. In this conception, vision inverts a generative model through an interrogation of the evidence in a process often thought to involve top-down predictions of sensory data to evaluate the likelihood of alternative hypotheses. The authors include scientists rooted in roughly equal numbers in each of the conceptions and motivated to overcome what might be a false dichotomy between them and engage the other perspective in the realm of theory and experiment. The primate brain employs an unknown algorithm that may combine the advantages of both conceptions. We explain and clarify the terminology, review the key empirical evidence, and propose an empirical research program that transcends the dichotomy and sets the stage for revealing the mysterious hybrid algorithm of primate vision.
The Topology and Geometry of Neural Representations
Lin, Baihan, Kriegeskorte, Nikolaus
A central question for neuroscience is how to characterize brain representations of perceptual and cognitive content. An ideal characterization should distinguish different functional regions with robustness to noise and idiosyncrasies of individual brains that do not correspond to computational differences. Previous studies have characterized brain representations by their representational geometry, which is defined by the representational dissimilarity matrix (RDM), a summary statistic that abstracts from the roles of individual neurons (or response channels) and characterizes the discriminability of stimuli. Here we explore a further step of abstraction: from the geometry to the topology of brain representations. We propose topological representational similarity analysis (tRSA), an extension of representational similarity analysis (RSA) that uses a family of geo-topological summary statistics that generalizes the RDM to characterize the topology while de-emphasizing the geometry. We evaluate this new family of statistics in terms of the sensitivity and specificity for model selection using both simulations and functional MRI (fMRI) data. In the simulations, the ground truth is a data-generating layer representation in a neural network model and the models are the same and other layers in different model instances (trained from different random seeds). In fMRI, the ground truth is a visual area and the models are the same and other areas measured in different subjects. Results show that topology-sensitive characterizations of population codes are robust to noise and interindividual variability and maintain excellent sensitivity to the unique representational signatures of different neural network layers and brain regions.
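The two core objects in this abstract can be sketched in a few lines of numpy: the RDM as a pairwise correlation-distance matrix, and a geo-topological transform that flattens small and large distances to de-emphasize geometry while preserving topology. This is an illustrative simplification under our own assumptions (a single lower/upper threshold band; the paper's tRSA uses a whole family of such statistics), not the authors' exact formulation:

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: correlation distance
    (1 - Pearson r) between all pairs of condition response patterns.
    patterns: (n_conditions, n_channels) array."""
    z = patterns - patterns.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return 1.0 - z @ z.T

def geo_topological(d, low, high):
    """Geo-topological transform (our simplified stand-in): distances
    below `low` map to 0, above `high` to 1, and are rescaled linearly
    in between, so only the coarse (topological) structure survives."""
    return np.clip((d - low) / (high - low), 0.0, 1.0)

rng = np.random.default_rng(0)
patterns = rng.standard_normal((10, 50))  # 10 conditions, 50 channels
d = rdm(patterns)
gt = geo_topological(d, low=0.5, high=1.5)
```

Model comparison would then proceed as in standard RSA, but comparing the transformed matrices rather than the raw RDMs.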
Testing the limits of natural language models for predicting human language judgments
Golan, Tal, Siegelman, Matthew, Kriegeskorte, Nikolaus, Baldassano, Christopher
Neural network language models can serve as computational hypotheses about how humans process language. We compared the model-human consistency of diverse language models using a novel experimental approach: controversial sentence pairs. For each controversial sentence pair, two language models disagree about which sentence is more likely to occur in natural text. Considering nine language models (including n-gram, recurrent neural networks, and transformer models), we created hundreds of such controversial sentence pairs by either selecting sentences from a corpus or synthetically optimizing sentence pairs to be highly controversial. Human subjects then provided judgments indicating for each pair which of the two sentences is more likely. Controversial sentence pairs proved highly effective at revealing model failures and identifying models that aligned most closely with human judgments. The most human-consistent model tested was GPT-2, although experiments also revealed significant shortcomings of its alignment with human perception.
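The selection of controversial sentence pairs from a corpus can be sketched with a toy scoring rule: pick the pair on which one model's preference for sentence i and the other model's preference for sentence j are jointly strongest. The min-of-margins score below is our illustrative assumption, not necessarily the paper's exact objective:

```python
import numpy as np

def most_controversial_pair(logp_a, logp_b):
    """Given per-sentence log-probabilities under two language models,
    return the pair (i, j) maximizing disagreement: model A assigns
    higher probability to sentence i, model B to sentence j.
    Score = min(A's log-prob margin for i over j,
                B's log-prob margin for j over i)."""
    da = logp_a[:, None] - logp_a[None, :]  # A prefers i over j
    db = logp_b[None, :] - logp_b[:, None]  # B prefers j over i
    score = np.minimum(da, db)
    np.fill_diagonal(score, -np.inf)
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return int(i), int(j), score[i, j]

# toy example: four candidate sentences, two models
logp_a = np.array([-10.0, -12.0, -11.0, -15.0])
logp_b = np.array([-14.0, -9.0, -13.0, -11.0])
i, j, s = most_controversial_pair(logp_a, logp_b)
```

Human judgments on the selected pairs then reveal which model's preference is the human-consistent one; the synthetic-optimization variant in the paper searches sentence space directly rather than a fixed corpus.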
Distinguishing representational geometries with controversial stimuli: Bayesian experimental design and its application to face dissimilarity judgments
Golan, Tal, Guo, Wenxuan, Schütt, Heiko H., Kriegeskorte, Nikolaus
Comparing representations of complex stimuli in neural network layers to human brain representations or behavioral judgments can guide model development. However, even qualitatively distinct neural network models often predict similar representational geometries of typical stimulus sets. We propose a Bayesian experimental design approach to synthesizing stimulus sets for adjudicating among representational models efficiently. We apply our method to discriminate among candidate neural network models of behavioral face dissimilarity judgments. Our results indicate that a neural network trained to invert a 3D-face-model graphics renderer is more human-aligned than the same architecture trained on identification, classification, or autoencoding. Our proposed stimulus synthesis objective is generally applicable to designing experiments to be analyzed by representational similarity analysis for model comparison.
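The spirit of the design objective can be illustrated with a greedy stand-in: choose a stimulus subset on which two candidate models predict maximally different representational geometries. The summed-absolute-RDM-difference criterion below is our hypothetical simplification of the paper's Bayesian design objective:

```python
import numpy as np

def greedy_controversial_subset(disagreement, k):
    """Greedily select k stimuli from a pool so that pairwise
    model disagreement within the chosen set is large.
    disagreement: symmetric (n, n) matrix, e.g. |RDM_A - RDM_B|
    after normalizing each model's RDM to a common scale."""
    # seed with the single most disagreed-upon stimulus pair
    i, j = np.unravel_index(np.argmax(disagreement), disagreement.shape)
    chosen = [int(i), int(j)]
    n = disagreement.shape[0]
    while len(chosen) < k:
        pool = [p for p in range(n) if p not in chosen]
        best = max(pool, key=lambda p: disagreement[p, chosen].sum())
        chosen.append(best)
    return chosen

# toy disagreement matrix over 5 candidate stimuli
dis = np.array([[0., 5., .5, 10., 0.],
                [5., 0., 0., 1., 0.],
                [.5, 0., 0., .5, 0.],
                [10., 1., .5, 0., 0.],
                [0., 0., 0., 0., 0.]])
subset = greedy_controversial_subset(dis, k=3)
```

A full Bayesian treatment would weight disagreement by posterior uncertainty over models rather than treating all pairs equally.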
Adaptive Independence Tests with Geo-Topological Transformation
Lin, Baihan, Kriegeskorte, Nikolaus
Testing two potentially multivariate variables for statistical dependence on the basis of finite samples is a fundamental statistical challenge. Here we explore a family of tests that adapt to the complexity of the relationship between the variables, promising robust power across scenarios. Building on the distance correlation, we introduce a family of adaptive independence criteria based on nonlinear monotonic transformations of distances. We show that these criteria, like the distance correlation and RKHS-based criteria, provide dependence indicators. We propose a class of adaptive (multi-threshold) test statistics, which form the basis for permutation tests. These tests empirically outperform some of the established tests in average and worst-case statistical sensitivity across a range of univariate and multivariate relationships and might deserve further exploration.
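The building blocks here, distance correlation computed after a monotonic transform of the distances, plus a permutation test, can be sketched as follows. This is a single-transform sketch under our own assumptions; the paper's adaptive statistics aggregate over multiple thresholds:

```python
import numpy as np

def _centered(d):
    """Double-center a distance matrix (biased version, for brevity)."""
    return d - d.mean(0, keepdims=True) - d.mean(1, keepdims=True) + d.mean()

def transformed_dcor(x, y, transform=lambda d: d):
    """Distance correlation of two 1D samples, computed after a
    monotonic transformation of the pairwise distances."""
    dx = transform(np.abs(x[:, None] - x[None, :]))
    dy = transform(np.abs(y[:, None] - y[None, :]))
    a, b = _centered(dx), _centered(dy)
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:
        return 0.0
    return np.sqrt(max((a * b).mean() / denom, 0.0))

def perm_pvalue(x, y, transform=lambda d: d, n_perm=200, seed=0):
    """Permutation test: shuffle y to build the null distribution."""
    rng = np.random.default_rng(seed)
    obs = transformed_dcor(x, y, transform)
    null = [transformed_dcor(x, rng.permutation(y), transform)
            for _ in range(n_perm)]
    return (1 + sum(n >= obs for n in null)) / (1 + n_perm)

x = np.arange(30.0)
y = 2.0 * x + 1.0          # perfectly dependent
r = transformed_dcor(x, y)
p = perm_pvalue(x, y, n_perm=200)
```

A thresholding transform (e.g. clipping large distances) makes the statistic sensitive to local structure that the raw distance correlation can miss; sweeping the threshold yields the multi-threshold family.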
Visualizing Representational Dynamics with Multidimensional Scaling Alignment
Lin, Baihan, Mur, Marieke, Kietzmann, Tim, Kriegeskorte, Nikolaus
Representational similarity analysis (RSA) has been shown to be an effective framework to characterize brain activity and deep neural network activations as representational geometry by computing the pairwise distances of the response patterns as a representational dissimilarity matrix (RDM). However, how to properly analyze and visualize the representational geometry as dynamics over the time course from stimulus onset to offset is not well understood. The scarcity of methods to characterize representational profiles as dynamics creates a major barrier to answering interesting questions such as: how are objects represented in the brain over the time course from early perception to categorical decision making; does object identification or visual categorization follow a hierarchical classification paradigm; do different classes of objects merge and branch at different time points based on different tasks or recurrence paradigms; are these representational dynamics oscillatory or recurrent? In this work, we formulated a pipeline to understand representational dynamics using multidimensional scaling alignment.
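The two ingredients named in the title, multidimensional scaling of each timepoint's RDM and alignment of the resulting embeddings so trajectories can be compared across time, can be sketched with classical MDS and orthogonal Procrustes. Function names and the specific MDS/Procrustes variants are our assumptions, not necessarily the paper's pipeline:

```python
import numpy as np

def classical_mds(d, k=2):
    """Classical (Torgerson) MDS: embed a distance matrix d into k dims."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j          # double-centered Gram matrix
    w, v = np.linalg.eigh(b)
    idx = np.argsort(w)[::-1][:k]        # top-k eigenvalues
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def procrustes_align(x, ref):
    """Rotate/reflect embedding x (orthogonal Procrustes) to best
    match a reference embedding, so per-timepoint MDS solutions
    become comparable and trajectories can be visualized smoothly."""
    u, _, vt = np.linalg.svd(x.T @ ref)
    return x @ (u @ vt)

# toy check: MDS of a 2D point cloud, aligned back to the original
rng = np.random.default_rng(1)
pts = rng.standard_normal((8, 2))
pts -= pts.mean(axis=0)                  # MDS output is centered
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
emb = classical_mds(d, k=2)
aligned = procrustes_align(emb, pts)
```

For dynamics, one would run `classical_mds` on each timepoint's RDM and align each frame to the previous (or to a common reference) before plotting the condition trajectories.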