visual area
- North America > United States (0.14)
- North America > Canada (0.04)
- Asia > China > Zhejiang Province (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Transformer brain encoders explain human high-level visual responses
Adeli, Hossein, Minni, Sun, Kriegeskorte, Nikolaus
A major goal of neuroscience is to understand brain computations during visual processing in naturalistic settings. A dominant approach is to use image-computable deep neural networks trained with different task objectives as a basis for linear encoding models. However, in addition to requiring estimation of a large number of linear encoding parameters, this approach ignores the structure of the feature maps both in the brain and the models. Recently proposed alternatives factor the linear mapping into separate sets of spatial and feature weights, thus finding static receptive fields for units, which is appropriate only for early visual areas. In this work, we employ the attention mechanism used in the transformer architecture to study how retinotopic visual features can be dynamically routed to category-selective areas in high-level visual processing. We show that this computational motif is significantly more powerful than alternative methods in predicting brain activity during natural scene viewing, across different feature basis models and modalities. We also show that this approach is inherently more interpretable as the attention-routing signals for different high-level categorical areas can be easily visualized for any input image. Given its high performance at predicting brain responses to novel images, the model deserves consideration as a candidate mechanistic model of how visual information from retinotopic maps is routed in the human brain based on the relevance of the input content to different category-selective regions.
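The attention-based routing described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `attention_readout`, the learned per-region queries, the projection matrices, and all dimensions are assumptions chosen for illustration, and the weights here are random rather than trained. Each category-selective region has a query vector that attends over retinotopic patch features, so the spatial weighting depends on the image content instead of being a fixed receptive field.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_readout(features, roi_queries, w_key, w_value, w_out):
    """Hypothetical attention-based encoder readout.

    features:    (n_patches, d_feat) retinotopic patch features from a backbone
    roi_queries: (n_rois, d_attn)    one query per brain region (assumed learned)
    w_key:       (d_feat, d_attn)    key projection (assumed learned)
    w_value:     (d_feat, d_attn)    value projection (assumed learned)
    w_out:       (n_rois, d_attn)    per-region linear readout (assumed learned)
    Returns predicted responses (n_rois,) and attention maps (n_rois, n_patches).
    """
    keys = features @ w_key
    values = features @ w_value
    # Scaled dot-product scores between each region query and each patch.
    scores = roi_queries @ keys.T / np.sqrt(keys.shape[1])
    attn = softmax(scores, axis=1)          # content-dependent routing weights
    routed = attn @ values                  # (n_rois, d_attn)
    preds = np.sum(routed * w_out, axis=1)  # one scalar response per region
    return preds, attn

# Toy example with random weights and features.
rng = np.random.default_rng(0)
n_patches, d_feat, d_attn, n_rois = 16, 8, 4, 3
features = rng.normal(size=(n_patches, d_feat))
roi_queries = rng.normal(size=(n_rois, d_attn))
w_key = rng.normal(size=(d_feat, d_attn))
w_value = rng.normal(size=(d_feat, d_attn))
w_out = rng.normal(size=(n_rois, d_attn))

preds, attn = attention_readout(features, roi_queries, w_key, w_value, w_out)
```

Reshaping one row of `attn` to the patch grid (4 x 4 in this toy case) gives the kind of per-region routing map that the abstract describes as easy to visualize for any input image.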
- North America > United States (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > Middle East > Israel (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- North America > United States (0.14)
- North America > Canada (0.04)
- Asia > China > Zhejiang Province (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- South America > Peru > Loreto Department (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Vision (0.69)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > Tompkins County > Ithaca (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
A Representation Descriptions
GloVe: a vector representation of word-word co-occurrence statistics. We used the Flair NLP [1] implementation of BERT embeddings. NWE is the GloVe representation, offset by one word into the future. Representations were built using a sliding window of 64 words as context. We used the HuggingFace [41] implementation of this network to extract features for these representations.
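The two operations mentioned above, a 64-word sliding context window and a one-word offset into the future, can be sketched as follows. This is an illustrative sketch only: the helper names `context_windows` and `next_word_embeddings` are hypothetical, and the choices to include the current word in its own window and to zero-pad the final position are assumptions not stated in the source.

```python
import numpy as np

def context_windows(words, window=64):
    """For each position i, return up to `window` words ending at word i.

    Assumption: the window includes the current word and is shorter
    at the start of the text, where fewer preceding words exist.
    """
    return [words[max(0, i - window + 1): i + 1] for i in range(len(words))]

def next_word_embeddings(embeddings):
    """Offset embeddings by one word into the future.

    Row i of the result holds the embedding of word i + 1.
    Assumption: the last row is zero-padded, since no next word exists.
    """
    shifted = np.zeros_like(embeddings)
    shifted[:-1] = embeddings[1:]
    return shifted

# Toy example: 100 placeholder words with random 5-dimensional embeddings.
words = [f"w{i}" for i in range(100)]
rng = np.random.default_rng(0)
emb = rng.normal(size=(len(words), 5))

windows = context_windows(words, window=64)
nwe = next_word_embeddings(emb)
```

In practice each window would be passed to a contextual language model to obtain the representation of its final word; that step is omitted here because it depends on the specific model and library version.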
- Europe > Finland > Uusimaa > Helsinki (0.06)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)