Goto

Collaborating Authors

 cortex


He Couldn't Land a Job Interview. Was AI to Blame?

WIRED

Armed with some Python and a white-hot sense of injustice, one medical student spent six months trying to figure out whether an algorithm trashed his job application. It was mid-October, peak leaf-peeping season in Hanover, New Hampshire, and Chad Markey was on a rare break between clinical rotations during his last year of medical school. He should have been inhaling Green Mountain air and gossiping with his Dartmouth classmates about life after graduation. In a few months, they'd all be going their separate ways to start residency training at hospitals around the country. Instead, Markey was alone in his apartment, deep down a rabbit hole, preparing to go to war. He'd wake each morning, eat breakfast, open his laptop at the kitchen table or settle into the tan armchair with the good back support, and start coding . Some days, he wouldn't notice the sun had gone down until one of his roommates came home and asked why the lights weren't on. For days, Markey had been scrolling through a Discord group about medical residency, a font of crowdsourced knowledge where students report back to their peers on every stage of the application and selection process. He'd watched as other students, lots of them, posted about the interview invitations they'd received.


Brain encoding models based on multimodal transformers can transfer across language and vision

Neural Information Processing Systems

Encoding models have been used to assess how the human brain represents concepts in language and vision. While language and vision rely on similar concept representations, current encoding models are typically trained and tested on brain responses to each modality in isolation. Recent advances in multimodal pretraining have produced transformers that can extract aligned representations of concepts in language and vision. In this work, we used representations from multimodal transformers to train encoding models that can transfer across fMRI responses to stories and movies. We found that encoding models trained on brain responses to one modality can successfully predict brain responses to the other modality, particularly in cortical regions that represent conceptual meaning. Further analysis of these encoding models revealed shared semantic dimensions that underlie concept representations in language and vision. Comparing encoding models trained using representations from multimodal and unimodal transformers, we found that multimodal transformers learn more aligned representations of concepts in language and vision. Our results demonstrate how multimodal transformers can provide insights into the brain's capacity for multimodal processing.


A mechanistic multi-area recurrent network model of decision-making

Neural Information Processing Systems

Recurrent neural networks (RNNs) trained on neuroscience-based tasks have been widely used as models for cortical areas performing analogous tasks. However, very few tasks involve a single cortical area, and instead require the coordination of multiple brain areas. Despite the importance of multi-area computation, there is a limited understanding of the principles underlying such computation. We propose to use multi-area RNNs with neuroscience-inspired architecture constraints to derive key features of multi-area computation. In particular, we show that incorporating multiple areas and Dale's Law is critical for biasing the networks to learn biologically plausible solutions. Additionally, we leverage the full observability of the RNNs to show that output-relevant information is preferentially propagated between areas. These results suggest that cortex uses modular computation to generate minimal sufficient representations of task information. More broadly, our results suggest that constrained multi-area RNNs can produce experimentally testable hypotheses for computations that occur within and across multiple brain areas, enabling new insights into distributed computation in neural systems.


Descriptions

Neural Information Processing Systems

GloVE [25] is a 300-dimensional word embedding space. It is an dimensionality-representation representation of word-word co-occurrence statistics. BERT-E [10] is a 3072-dimensional contextualized word embedding space extracted from BERT. We used the Flair NLP [1] implementation of BERT embeddings. FLAIR [1] is a 4096-dimensional contextualized character level word embedding space.



Emergence of Hierarchical Layers in a Single Sheet of Self-Organizing Spiking Neurons

Neural Information Processing Systems

Traditionally convolutional neural network architectures have been designed by stacking layers on top of each other to form deeper hierarchical networks. The cortex in the brain however does not just stack layers as done in standard convolution neural networks, instead different regions are organized next to each other in a large single sheet of neurons. Biological neurons self organize to form topographic maps, where neurons encoding similar stimuli group together to form logical clusters. Here we propose new self-organization principles that allow for the formation of hierarchical cortical regions (i.e.


This tool could show how consciousness works

MIT Technology Review

Transcranial focused ultrasound is a noninvasive way to stimulate the brain and see how it functions. How does the physical matter in our brains translate into thoughts, sensations, and emotions? It's hard to explore that question without neurosurgery. But in a recent paper, MIT philosopher Matthias Michel, Lincoln Lab researcher Daniel Freeman, and colleagues outline a strategy for doing so with an emerging tool called transcranial focused ultrasound. This noninvasive technology reaches deeper into the brain, with greater resolution, than techniques such as EEG and MRI. It works by sending acoustic waves through the skull to focus on an area of a few millimeters, allowing specific brain structures to be stimulated so the effects can be studied.


Automatic Neuron Detection in Calcium Imaging Data Using Convolutional Networks

Neural Information Processing Systems

Calcium imaging is an important technique for monitoring the activity of thousands of neurons simultaneously. As calcium imaging datasets grow in size, automated detection of individual neurons is becoming important. Here we apply a supervised learning approach to this problem and show that convolutional networks can achieve near-human accuracy and superhuman speed. Accuracy is superior to the popular PCA/ICA method based on precision and recall relative to ground truth annotation by a human expert. These results suggest that convolutional networks are an efficient and flexible tool for the analysis of large-scale calcium imaging data.


Contrastive-Equivariant Self-Supervised Learning Improves Alignment with Primate Visual Area IT

Neural Information Processing Systems

Models trained with self-supervised learning objectives have recently matched or surpassed models trained with traditional supervised object recognition in their ability to predict neural responses of object-selective neurons in the primate visual system. A self-supervised learning objective is arguably a more biologically plausible organizing principle, as the optimization does not require a large number of labeled examples. However, typical self-supervised objectives may result in network representations that are overly invariant to changes in the input. Here, we show that a representation with structured variability to the input transformations is better aligned with known features of visual perception and neural computation. We introduce a novel framework for converting standard invariant SSL losses into contrastive-equivariant versions that encourage preserving aspects of the input transformation without supervised access to the transformation parameters. We further demonstrate that our proposed method systematically increases models' ability to predict responses in macaque inferior temporal cortex. Our results demonstrate the promise of incorporating known features of neural computation into task-optimization for building better models of visual cortex.


Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex.

Neural Information Processing Systems

We characterized the generalization capabilities of deep neural network encoding models when predicting neuronal responses from the visual cortex to flashed images. We collected MacaqueITBench, a large-scale dataset of neuronal population responses from the macaque inferior temporal (IT) cortex to over $300,000$ images, comprising $8,233$ unique natural images presented to seven monkeys over $109$ sessions. Using MacaqueITBench, we investigated the impact of distribution shifts on models predicting neuronal activity by dividing the images into Out-Of-Distribution (OOD) train and test splits. The OOD splits included variations in image contrast, hue, intensity, temperature, and saturation. Compared to the performance on in-distribution test images---the conventional way in which these models have been evaluated---models performed worse at predicting neuronal responses to out-of-distribution images, retaining as little as $20\\%$ of the performance on in-distribution test images. Additionally, the relative ranking of different models in terms of their ability to predict neuronal responses changed drastically across OOD shifts. The generalization performance under OOD shifts can be well accounted by a simple image similarity metric---the cosine distance between image representations extracted from a pre-trained object recognition model is a strong predictor of neuronal predictivity under different distribution shifts.