cognition
Embodied Cognition Augmented End2End Autonomous Driving
In recent years, vision-based end-to-end autonomous driving has emerged as a new paradigm. However, popular end-to-end approaches typically rely on visual feature extraction networks trained under label supervision. This limited supervision framework restricts the generality and applicability of driving models. In this paper, we propose a novel paradigm termed E3AD, which advocates contrastive learning between visual feature extraction networks and a general large EEG model, in order to learn latent human driving cognition for enhancing end-to-end planning. In this work, we collected a cognitive dataset for this contrastive learning process. Subsequently, we investigated the methods and potential mechanisms for enhancing end-to-end planning with human driving cognition, using popular driving models as baselines on publicly available autonomous driving datasets. Both open-loop and closed-loop tests are conducted for a comprehensive evaluation of planning performance. Experimental results demonstrate that the E3AD paradigm significantly enhances the end-to-end planning performance of baseline models.
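The abstract does not say which contrastive objective E3AD uses. As a rough illustration only, the sketch below assumes a symmetric InfoNCE-style loss over paired visual and EEG embeddings; the function names, the temperature value, and the toy vectors are all hypothetical and not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce_loss(visual, eeg, temperature=0.1):
    """Symmetric InfoNCE loss; visual[i] and eeg[i] are assumed to be a matched pair.

    Hypothetical sketch: the loss choice and temperature are assumptions,
    not details reported in the abstract.
    """
    v = [l2_normalize(x) for x in visual]
    e = [l2_normalize(x) for x in eeg]
    n = len(v)

    # Temperature-scaled cosine similarity between every visual/EEG pair.
    logits = [
        [sum(a * b for a, b in zip(v[i], e[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_sum - row[i]
        return total / len(rows)

    # Average the visual-to-EEG and EEG-to-visual directions.
    cols = [list(c) for c in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(cols))


# Toy check: correctly paired embeddings should give a much lower loss
# than the same embeddings paired with the wrong partner.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this kind pulls each visual embedding toward the EEG embedding recorded for the same scene and pushes it away from the others in the batch, which is one plausible way a visual encoder could absorb signal from an EEG model.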
Statistical or embodied? Comparing people and LLMs in their processing of color metaphors: an interview with Douglas Guilbeault
We sat down with Douglas Guilbeault to discuss his paper, "Comparing Colorseeing, Colorblind, Painters, and Large Language Models in Their Processing of Color Metaphors". The results have interesting implications for how we model human cognition, and in turn, how the concept of synaesthesia could be integrated to develop more intelligent AI models. A color metaphor is the use of color to describe something in a way that is not immediately literal. For example, to say "green with envy" would be a color metaphor, because envy doesn't have an immediate visual structure to it - we're evoking a broader, more flexible notion of what green conveys, beyond just its visible properties. What makes metaphors very interesting is that they often use past experience or cultural associations in new ways to talk about something beyond our current perception - either something imagined or in the future, which are many steps of abstraction away from the present. Metaphors provide an alternative pathway to get there.
BrainMoE: Cognition Joint Embedding via Mixture-of-Expert Towards Robust Brain Foundation Model
Given the large scale of public functional Magnetic Resonance Imaging (fMRI) datasets, e.g., UK Biobank (UKB) and the Human Connectome Project (HCP), brain foundation models are emerging. Although the number of samples collected under rich environmental variables is unprecedented, existing brain foundation models learn from fMRI derived from a narrow range of cognitive states stimulated by similar environments, which limits their robustness across applications and across datasets acquired with different pipelines and limited sample sizes. By capitalizing on the variety of cognitive states that arise as subjects perform explicit tasks, we present a mixture of brain experts, namely BrainMoE, which pre-trains on task fMRI with rich behavioral tasks in addition to resting fMRI to build a robust brain foundation model. Brain experts are designed to produce embeddings for different behavioral tasks related to cognition. Afterward, these cognition embeddings are mixed by a cognition adapter via cross-attention so that BrainMoE can handle orthogonal embeddings and be robust on those boutique downstream datasets. We have pre-trained two existing self-regressive architectures and one new supervised architecture as brain experts on 68,251 fMRI scans from UKB and HCP, covering 12 different cognitive states. BrainMoE is then evaluated on a variety of applications, including sex and age prediction, human behavior recognition, early diagnosis of autism, Parkinson's disease, Alzheimer's disease, and schizophrenia, and fMRI-EEG multimodal applications, where promising results on eight datasets from three different pipelines indicate great potential to facilitate current neuroimaging applications in clinical routines.
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism. Among the many dimensions of meaning, sound symbolism is particularly salient and well-demonstrated with regard to cross-modal associations between language and the visual domain. In this work, we address the question of whether sound symbolism is reflected in vision-and-language models such as CLIP and Stable Diffusion. Using zero-shot knowledge probing to investigate the inherent knowledge of these models, we find strong evidence that they do show this pattern, paralleling the well-known kiki-bouba effect in psycholinguistics. Our work provides a novel method for demonstrating sound symbolism and understanding its nature using computational tools. Our code will be made publicly available.
A history of RoboCup with Manuela Veloso
RoboCup is an international competition that promotes and advances robotics and AI through the challenges presented by its various leagues. We got the chance to sit down with Professor Manuela Veloso, one of RoboCup's founders, to find out more about how it all started, how the community has grown over the years, and the vision for the future. I think it would be very interesting to go right back to the beginning and hear how RoboCup got started. What was the initial idea, and how did it get set up? So we are talking about the mid-90s. In terms of the research in those days, it was the beginning of the internet and many AI and computer science researchers were focused on the internet, first on sophisticated search algorithms, on natural language understanding, on information retrieval, and then on software agents and machine learning applied to digital information. From what I recall, there was a smaller group of researchers who were interested in actual, physical robots, and in particular in AI and robotics.
Probabilistic Joint and Individual Variation Explained (ProJIVE) for Data Integration
Murden, Raphiel J., Tian, Ganzhong, Qiu, Deqiang, Risk, Benjamin B.
Collecting multiple types of data on the same set of subjects is common in modern scientific applications, including genomics, metabolomics, and neuroimaging. Joint and Individual Variation Explained (JIVE) seeks a low-rank approximation of the joint variation between two or more sets of features captured on common subjects and isolates this variation from that unique to each set of features. We develop an expectation-maximization (EM) algorithm to estimate a probabilistic model for the JIVE framework. The model extends probabilistic principal components analysis to multiple data sets. Our maximum likelihood approach simultaneously estimates joint and individual components, which can lead to greater accuracy compared to other methods. We apply ProJIVE to measures of brain morphometry and cognition in Alzheimer's disease. ProJIVE learns biologically meaningful sources of variation, and the joint morphometry and cognition subject scores are strongly related to more expensive existing biomarkers. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Code to reproduce the analysis is available on our GitHub page.
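The abstract does not write the model out. One way to express a probabilistic JIVE-style model that extends probabilistic PCA to several data sets is sketched below, assuming Gaussian latent scores and isotropic noise within each data set. All symbols are illustrative assumptions rather than the paper's notation: $x_{ik}$ is subject $i$'s feature vector in data set $k$ with $p_k$ features, $z_i$ holds the joint scores shared across data sets, $b_{ik}$ holds the individual scores, $W_{Jk}$ and $W_{Ik}$ are the joint and individual loading matrices, and $r_J$ and $r_{Ik}$ are the joint and individual ranks.

```latex
\begin{aligned}
x_{ik} &= \mu_k + W_{Jk}\, z_i + W_{Ik}\, b_{ik} + \varepsilon_{ik},
  \qquad k = 1, \dots, K, \\
z_i &\sim \mathcal{N}\!\left(0, I_{r_J}\right), \quad
b_{ik} \sim \mathcal{N}\!\left(0, I_{r_{Ik}}\right), \quad
\varepsilon_{ik} \sim \mathcal{N}\!\left(0, \sigma_k^2 I_{p_k}\right).
\end{aligned}
```

Under these assumptions the joint scores $z_i$ appear in every data set while each $b_{ik}$ appears in only one, so an EM algorithm can treat both as latent variables and estimate the loadings and noise variances by maximum likelihood, as in probabilistic PCA.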