incorporating context
Incorporating Context into Language Encoding Models for fMRI
Language encoding models help explain language processing in the human brain by learning functions that predict brain responses from the language stimuli that elicited them. Current word embedding-based approaches treat each stimulus word independently and thus ignore the influence of context on language understanding. In this work we instead build encoding models using rich contextual representations derived from an LSTM language model. Our models show a significant improvement in encoding performance relative to state-of-the-art embeddings in nearly every brain area. By varying the amount of context used in the models and providing the models with distorted context, we show that this improvement is due to a combination of better word embeddings learned by the LSTM language model and contextual information. We are also able to use our models to map context sensitivity across the cortex. These results suggest that LSTM language models learn high-level representations that are related to representations in the human brain.
DiVR: incorporating context from diverse VR scenes for human trajectory prediction
Gallo, Franz Franco, Wu, Hui-Yin, Sassatelli, Lucile
Virtual environments provide a rich and controlled setting for collecting detailed data on human behavior, offering unique opportunities for predicting human trajectories in dynamic scenes. However, most existing approaches have overlooked the potential of these environments, focusing instead on static contexts without considering userspecific factors. Employing the CREATTIVE3D dataset, our work models trajectories recorded in virtual reality (VR) scenes for diverse situations including road-crossing tasks with user interactions and simulated visual impairments. We propose Diverse Context VR Human Motion Prediction (DiVR), a cross-modal transformer based on the Perceiver architecture that integrates both static and dynamic scene context using a heterogeneous graph convolution network. We conduct extensive experiments comparing DiVR against existing architectures including MLP, LSTM, and transformers with gaze and point cloud context. Additionally, we also stress test our model's generalizability across different users, tasks, and scenes. Results show that DiVR achieves higher accuracy and adaptability compared to other models and to static graphs. This work highlights the advantages of using VR datasets for context-aware human trajectory modeling, with potential applications in enhancing user experiences in the metaverse.
- Europe > France > Provence-Alpes-Côte d'Azur (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Transportation (0.49)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.48)
- Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Reviews: Incorporating Context into Language Encoding Models for fMRI
This paper compares the embedding of a 3-layer LSTM to the neural responses of people listening to podcasts recorded via fMRI. The experiments vary the number of layers in the LSTM, and then context available to the LSTM and compare it to a context-free word embedding model. This is a strong paper, well written and clear. The results are thorough and there are a few interesting surprises. I have a few questions of clarification. 1) How do the authors account for the differences in number of words per TR due to differing word length and prosody?
Incorporating Context into Language Encoding Models for fMRI
Jain, Shailee, Huth, Alexander
Language encoding models help explain language processing in the human brain by learning functions that predict brain responses from the language stimuli that elicited them. Current word embedding-based approaches treat each stimulus word independently and thus ignore the influence of context on language understanding. In this work we instead build encoding models using rich contextual representations derived from an LSTM language model. Our models show a significant improvement in encoding performance relative to state-of-the-art embeddings in nearly every brain area. By varying the amount of context used in the models and providing the models with distorted context, we show that this improvement is due to a combination of better word embeddings learned by the LSTM language model and contextual information.