Jain, Shailee
A generative framework to bridge data-driven models and scientific theories in language neuroscience
Antonello, Richard, Singh, Chandan, Jain, Shailee, Hsu, Aliyah, Gao, Jianfeng, Yu, Bin, Huth, Alexander
However, such data-driven models are not scientific theories that describe the world in natural language. Instead, they are implemented in the form of vast neural networks with millions or billions of largely inscrutable parameters. One emblematic field is language neuroscience, where large language models (LLMs) are highly effective at predicting human brain responses to natural language, but are virtually impossible to interpret or analyze by hand [4-10]. To overcome this challenge, we introduce the generative explanation-mediated validation (GEM-V) framework. GEM-V translates deep learning models of language selectivity in the brain into concise verbal explanations, and then designs follow-up experiments to verify that these explanations are causally related to brain activity.
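A minimal sketch of the explain-then-validate loop described above, on synthetic data. The helper names (fit_encoding_model, explain_voxel, validate_explanation) are hypothetical stand-ins, not the paper's API: the actual framework produces explanations with an LLM and validates them in follow-up fMRI experiments, not against a ridge model's own weights.

```python
import numpy as np

# Hypothetical stand-ins for the three GEM-V stages named in the abstract:
# (1) fit a predictive model of a voxel, (2) summarize what drives it as a
# short verbal explanation, (3) check that stimuli matching the explanation
# are predicted to drive the voxel more than stimuli that avoid it.

rng = np.random.default_rng(0)

def fit_encoding_model(X, y, alpha=1.0):
    """Ridge regression from stimulus features to one voxel's response."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n), X.T @ y)

def explain_voxel(w, feature_names, top_k=2):
    """Toy 'explanation': name the largest-weight features. The real
    framework uses an LLM to write a natural language summary."""
    return [feature_names[i] for i in np.argsort(-np.abs(w))[:top_k]]

def validate_explanation(w, probe_on, probe_off):
    """Predicted response gap between stimuli that match the explanation
    and stimuli that avoid it; a large gap supports a causal account."""
    return float(probe_on @ w - probe_off @ w)

feature_names = ["food", "places", "numbers", "social", "time"]
X = rng.normal(size=(200, 5))                     # 200 stimuli, 5 features
y = X @ np.array([2.0, 0.1, 0.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=200)

w = fit_encoding_model(X, y)
print("explanation:", explain_voxel(w, feature_names))
print("validation gap:", round(validate_explanation(
    w, probe_on=np.eye(5)[0], probe_off=np.eye(5)[2]), 2))
```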
Explaining black box text modules in natural language with language models
Singh, Chandan, Hsu, Aliyah R., Antonello, Richard, Jain, Shailee, Huth, Alexander G., Yu, Bin, Gao, Jianfeng
Large language models (LLMs) have demonstrated remarkable prediction performance for a growing array of tasks. However, their rapid proliferation and increasing opaqueness have created a growing need for interpretability. Here, we ask whether we can automatically obtain natural language explanations for black box text modules. A "text module" is any function that maps text to a scalar continuous value, such as a submodule within an LLM or a fitted model of a brain region. "Black box" indicates that we only have access to the module's inputs/outputs. We introduce Summarize and Score (SASC), a method that takes in a text module and returns a natural language explanation of the module's selectivity along with a score for how reliable the explanation is. We study SASC in three contexts. First, we evaluate SASC on synthetic modules and find that it often recovers ground truth explanations. Second, we use SASC to explain modules found within a pre-trained BERT model, enabling inspection of the model's internals. Finally, we show that SASC can generate explanations for the response of individual fMRI voxels to language stimuli, with potential applications to fine-grained brain mapping. All code for using SASC and reproducing results is made available on GitHub.
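A toy sketch of the summarize-then-score loop, assuming a keyword-counting module and a hand-written summary in place of the LLM calls SASC actually makes; only the two-step structure is taken from the abstract.

```python
import numpy as np

# Toy module: any function mapping text to a scalar. A fruit-word counter
# plays the part of "a submodule within an LLM or a fitted model of a
# brain region".
def module(text: str) -> float:
    return float(sum(w in {"apple", "banana", "pear"} for w in text.split()))

corpus = ["apple pie recipe", "stock market news", "banana bread tips",
          "pear and apple salad", "weather forecast today"]

# Step 1 (Summarize): find the inputs that most activate the module and
# summarize them. SASC prompts an LLM for this; here the summary is fixed.
activations = [module(t) for t in corpus]
top_inputs = [corpus[i] for i in np.argsort(activations)[-3:]]
print("top-activating inputs:", top_inputs)
explanation = "fruit"  # stand-in summary of top_inputs

# Step 2 (Score): an explanation is reliable if text generated from it
# drives the module more than unrelated baseline text does.
related = ["apple and pear varieties", "ripe banana snacks"]
baseline = ["tax law overview", "football match report"]
reliability = (np.mean([module(t) for t in related])
               - np.mean([module(t) for t in baseline]))
print(f"explanation: {explanation!r}  reliability: {reliability:.2f}")
```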
A single-layer RNN can approximate stacked and bidirectional RNNs, and topologies in between
Turek, Javier S., Jain, Shailee, Capota, Mihai, Huth, Alexander G., Willke, Theodore L.
To enhance the expressiveness and representational capacity of recurrent neural networks (RNNs), a large body of work has emerged exploring stacked architectures with additional topological modifications like shortcut connections or bidirectionality. However, choosing the best network for a particular problem requires a combinatorial search over architectures and their hyperparameters. In this work, we show that a single-layer RNN can perfectly mimic an arbitrarily deep stacked RNN under specific constraints on its weight matrix and a delay between input and output. This obviates the need to manually select hyperparameters like the number of layers. Additionally, we show that weakening the weight constraints while keeping the delay gives rise to partial acausality in the single-layer RNN, much like a bidirectional network. Synthetic experiments confirm that the delayed RNN can mimic bidirectional networks, perfectly solving some acausal tasks and outperforming them on others. Finally, we show that in a challenging language processing task, the delayed RNN performs within 0.3% of the accuracy of the bidirectional network while reducing computational costs.
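A numerical check of the equivalence claim, under my own reading of the construction rather than the paper's exact constraints: concatenate the stacked layers' states into one vector, give layer l a recurrent block plus a block reading the previous state of layer l-1, and read the output (depth - 1) steps late.

```python
import numpy as np

# Single-layer RNN with block-structured weights vs. a stacked RNN.
# Uses tanh, whose f(0) = 0 keeps the lagged blocks consistent with
# zero initialization.

rng = np.random.default_rng(0)
L, d, d_in, T = 3, 4, 2, 20          # layers, hidden size, input size, steps
f = np.tanh

W = [rng.normal(scale=0.3, size=(d, d)) for _ in range(L)]      # recurrent
U = [rng.normal(scale=0.3, size=(d, d_in if l == 0 else d))     # from below
     for l in range(L)]
x = rng.normal(size=(T, d_in))

# Reference: run the stacked RNN layer by layer.
h = np.zeros((L, d))
stacked_out = []
for t in range(T):
    inp = x[t]
    for l in range(L):
        h[l] = f(W[l] @ h[l] + U[l] @ inp)
        inp = h[l]
    stacked_out.append(h[-1].copy())

# Single-layer RNN: block-structured weights over the concatenated state.
Wbig = np.zeros((L * d, L * d))
for l in range(L):
    Wbig[l*d:(l+1)*d, l*d:(l+1)*d] = W[l]        # layer l recurrence
    if l > 0:
        Wbig[l*d:(l+1)*d, (l-1)*d:l*d] = U[l]    # layer l-1 -> layer l
Uin = np.zeros((L * d, d_in))
Uin[:d] = U[0]                                   # input enters block 1 only

s = np.zeros(L * d)
single_out = []
for t in range(T):
    s = f(Wbig @ s + Uin @ x[t])
    single_out.append(s[-d:].copy())

# Block L at time t holds the stacked output from (L - 1) steps earlier.
delay = L - 1
assert np.allclose(stacked_out[:T - delay], single_out[delay:])
print(f"outputs match exactly up to a {delay}-step delay")
```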
Incorporating Context into Language Encoding Models for fMRI
Jain, Shailee, Huth, Alexander
Language encoding models help explain language processing in the human brain by learning functions that predict brain responses from the language stimuli that elicited them. Current word embedding-based approaches treat each stimulus word independently and thus ignore the influence of context on language understanding. In this work, we instead build encoding models using rich contextual representations derived from an LSTM language model. Our models show a significant improvement in encoding performance relative to state-of-the-art embeddings in nearly every brain area. By varying the amount of context used in the models and providing the models with distorted context, we show that this improvement is due to a combination of better word embeddings learned by the LSTM language model and contextual information. We are also able to use our models to map context sensitivity across the cortex. These results suggest that LSTM language models learn high-level representations that are related to representations in the human brain.
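A compact sketch of the encoding-model setup on synthetic data: an untrained torch LSTM stands in for the trained language model, random projections stand in for voxel responses, and the amount of context is varied by resetting the LSTM state every few words. None of this is the paper's actual pipeline, which used a trained LSTM language model and real fMRI recordings.

```python
import numpy as np
import torch

torch.manual_seed(0)
rng = np.random.default_rng(0)

T, vocab, d_emb, d_hid, n_vox = 300, 50, 16, 32, 10
words = torch.randint(vocab, (1, T))             # one stimulus "story"

emb = torch.nn.Embedding(vocab, d_emb)
lstm = torch.nn.LSTM(d_emb, d_hid, batch_first=True)

def contextual_features(context_len):
    """Hidden state for each word, resetting the LSTM state every
    `context_len` words to limit how much context it can carry."""
    feats = []
    for start in range(0, T, context_len):
        chunk = words[:, start:start + context_len]
        h, _ = lstm(emb(chunk))                  # fresh state per chunk
        feats.append(h.squeeze(0))
    return torch.cat(feats).detach().numpy()

# Synthetic voxel responses driven by full-context features plus noise.
X_full = contextual_features(context_len=T)
B = rng.normal(size=(d_hid, n_vox))
Y = X_full @ B + rng.normal(scale=1.0, size=(T, n_vox))

def encoding_score(X, Y, alpha=10.0, n_train=200):
    """Fit ridge on the first n_train samples; mean test correlation."""
    Xtr, Ytr, Xte, Yte = X[:n_train], Y[:n_train], X[n_train:], Y[n_train:]
    W = np.linalg.solve(Xtr.T @ Xtr + alpha * np.eye(X.shape[1]), Xtr.T @ Ytr)
    pred = Xte @ W
    r = [np.corrcoef(pred[:, v], Yte[:, v])[0, 1] for v in range(n_vox)]
    return float(np.mean(r))

for ctx in (1, 5, 25, T):                        # vary available context
    print(f"context {ctx:>3} words -> mean test r = "
          f"{encoding_score(contextual_features(ctx), Y):.2f}")
```

Because the synthetic responses were generated from the full-context features, test correlation should rise with context length, mirroring the context-sensitivity analysis the abstract describes.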