Goto

Collaborating Authors

 brain state


Transformers for Multimodal Brain State Decoding: Integrating Functional Magnetic Resonance Imaging Data and Medical Metadata

Jazi, Danial Jafarzadeh, Hajiesmaeili, Maryam

arXiv.org Artificial Intelligence

Abstract--Decoding brain states from functional magnetic resonance imaging (fMRI) data is vital for advancing neuroscience and clinical applications. While traditional machine learning and deep learning approaches have made strides in leveraging the high-dimensional and complex nature of fMRI data, they often fail to utilize the contextual richness provided by Digital Imaging and Communications in Medicine (DICOM) metadata. This paper presents a novel framework integrating transformer-based architectures with multimodal inputs, including fMRI data and DICOM metadata. By employing attention mechanisms, the proposed method captures intricate spatial-temporal patterns and contextual relationships, enhancing model accuracy, inter-pretability, and robustness. The potential of this framework spans applications in clinical diagnostics, cognitive neuroscience, and personalized medicine. Limitations, such as metadata variability and computational demands, are addressed, and future directions for optimizing scalability and generalizability are discussed.


Dynamic Functional Connectivity Features for Brain State Classification: Insights from the Human Connectome Project

Kirova, Valeriya, Kadieva, Dzerassa, Vlasenko, Daniil, Blank, Isak B., Ratnikov, Fedor

arXiv.org Artificial Intelligence

Abstract--We analyze functional magnetic resonance imaging (fMRI) data from the Human Connectome Project (HCP) to match brain activities during a range of cognitive tasks. Our findings demonstrate that even basic linear machine learning models can effectively classify brain states and achieve state-of-the-art accuracy, particularly for tasks related to motor functions and language processing. Feature importance ranking allows to identify distinct sets of brain regions whose activation patterns are uniquely associated with specific cognitive functions. These discriminative features provide strong support for the hypothesis of functional specialization across cortical and subcortical areas of the human brain. Additionally, we investigate the temporal dynamics of the identified brain regions, demonstrating that the time-dependent structure of fMRI signals are essential for shaping functional connectivity between regions: uncorrelated areas are least important for classification. This temporal perspective provides deeper insights into the formation and modulation of brain neural networks involved in cognitive processing. Modern neuroimaging techniques, such as fMRI, enable the investigation of brain activity in real time, opening new avenues for studying cognitive processes. However, the analysis of fMRI data represents a complex challenge due to its high-dimensional and dynamic nature.


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The manuscript describes a very interesting model for the analysis of brain states for multi-region LFP time-series. The time-series are separated in different time-windows. An infinite mixture of Gaussian Processes is considered to model the observations in each window. Brain states are assigned to each observation by means of an underlying HDP and brain regions are assigned to clusters by means of a HDP.


Analysis of Brain States from Multi-Region LFP Time-Series

Kyle R. Ulrich, David E. Carlson, Wenzhao Lian, Jana S. Borg, Kafui Dzirasa, Lawrence Carin

Neural Information Processing Systems

The local field potential (LFP) is a source of information about the broad patterns of brain activity, and the frequencies present in these time-series measurements are often highly correlated between regions. It is believed that these regions may jointly constitute a "brain state," relating to cognition and behavior. An infinite hidden Markov model (iHMM) is proposed to model the evolution of brain states, based on electrophysiological LFP data measured at multiple brain regions. A brain state influences the spectral content of each region in the measured LFP . A new state-dependent tensor factorization is employed across brain regions, and the spectral properties of the LFPs are characterized in terms of Gaussian processes (GPs). The LFPs are modeled as a mixture of GPs, with state-and region-dependent mixture weights, and with the spectral content of the data encoded in GP spectral mixture covariance kernels. The model is able to estimate the number of brain states and the number of mixture components in the mixture of GPs. A new variational Bayesian split-merge algorithm is employed for inference. The model infers state changes as a function of external covariates in two novel elec-trophysiological datasets, using LFP data recorded simultaneously from multiple brain regions in mice; the results are validated and interpreted by subject-matter experts.


GP Kernels for Cross-Spectrum Analysis

Kyle R. Ulrich, David E. Carlson, Kafui Dzirasa, Lawrence Carin

Neural Information Processing Systems

Multi-output Gaussian processes provide a convenient framework for multi-task problems. An illustrative and motivating example of a multi-task problem is multi-region electrophysiological time-series data, where experimentalists are interested in both power and phase coherence between channels. Recently, Wilson and Adams (2013) proposed the spectral mixture (SM) kernel to model the spectral density of a single task in a Gaussian process framework. In this paper, we develop a novel covariance kernel for multiple outputs, called the cross-spectral mixture (CSM) kernel. This new, flexible kernel represents both the power and phase relationship between multiple observation channels. We demonstrate the expressive capabilities of the CSM kernel through implementation of a Bayesian hidden Markov model, where the emission distribution is a multi-output Gaussian process with a CSM covariance kernel. Results are presented for measured multi-region electrophysiological data.


WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities

Zeng, Ziyi, Cai, Zhenyang, Cai, Yixi, Wang, Xidong, Chen, Junying, Wang, Rongsheng, Liu, Yipeng, Cai, Siqi, Wang, Benyou, Zhang, Zhiguo, Li, Haizhou

arXiv.org Artificial Intelligence

Electroencephalography (EEG) interpretation using multimodal large language models (MLLMs) offers a novel approach for analyzing brain signals. However, the complex nature of brain activity introduces critical challenges: EEG signals simultaneously encode both cognitive processes and intrinsic neural states, creating a mismatch in EEG paired-data modality that hinders effective cross-modal representation learning. Through a pivot investigation, we uncover complementary relationships between these modalities. Leveraging this insight, we propose mapping EEG signals and their corresponding modalities into a unified semantic space to achieve generalized interpretation. To fully enable conversational capabilities, we further introduce WaveMind-Instruct-338k, the first cross-task EEG dataset for instruction tuning. The resulting model demonstrates robust classification accuracy while supporting flexible, open-ended conversations across four downstream tasks, thereby offering valuable insights for both neuroscience research and the development of general-purpose EEG models.


Analysis of Brain States from Multi-Region LFP Time-Series

Neural Information Processing Systems

The local field potential (LFP) is a source of information about the broad patterns of brain activity, and the frequencies present in these time-series measurements are often highly correlated between regions. It is believed that these regions may jointly constitute a ``brain state,'' relating to cognition and behavior. An infinite hidden Markov model (iHMM) is proposed to model the evolution of brain states, based on electrophysiological LFP data measured at multiple brain regions. A brain state influences the spectral content of each region in the measured LFP. A new state-dependent tensor factorization is employed across brain regions, and the spectral properties of the LFPs are characterized in terms of Gaussian processes (GPs). The LFPs are modeled as a mixture of GPs, with state-and region-dependent mixture weights, and with the spectral content of the data encoded in GP spectral mixture covariance kernels. The model is able to estimate the number of brain states and the number of mixture components in the mixture of GPs. A new variational Bayesian split-merge algorithm is employed for inference. The model infers state changes as a function of external covariates in two novel electrophysiological datasets, using LFP data recorded simultaneously from multiple brain regions in mice; the results are validated and interpreted by subject-matter experts.


Ensemble-Based Graph Representation of fMRI Data for Cognitive Brain State Classification

Vlasenko, Daniil, Ushakov, Vadim, Zaikin, Alexey, Zakharov, Denis

arXiv.org Artificial Intelligence

Understanding and classifying human cognitive brain states based on neuroimaging data remains one of the foremost and most challenging problems in neuroscience, owing to the high dimensionality and intrinsic noise of the signals. In this work, we propose an ensemble-based graph representation method of functional magnetic resonance imaging (fMRI) data for the task of binary brain-state classification. Our method builds the graph by leveraging multiple base machine-learning models: each edge weight reflects the difference in posterior probabilities between two cognitive states, yielding values in the range [-1, 1] that encode confidence in a given state. We applied this approach to seven cognitive tasks from the Human Connectome Project (HCP 1200 Subject Release), including working memory, gambling, motor activity, language, social cognition, relational processing, and emotion processing. Using only the mean incident edge weights of the graphs as features, a simple logistic-regression classifier achieved average accuracies from 97.07% to 99.74%. We also compared our ensemble graphs with classical correlation-based graphs in a classification task with a graph neural network (GNN). In all experiments, the highest classification accuracy was obtained with ensemble graphs. These results demonstrate that ensemble graphs convey richer topological information and enhance brain-state discrimination. Our approach preserves edge-level interpretability of the fMRI graph representation, is adaptable to multiclass and regression tasks, and can be extended to other neuroimaging modalities and pathological-state classification.


FuzzCoh: Robust Canonical Coherence-Based Fuzzy Clustering of Multivariate Time Series

Ma, Ziling, Talento, Mara Sherlin, Sun, Ying, Ombao, Hernando

arXiv.org Machine Learning

Brain cognitive and sensory functions are often associated with electrophysiological activity at specific frequency bands. Clustering multivariate time series (MTS) data like EEGs is important for understanding brain functions but challenging due to complex non-stationary cross-dependencies, gradual transitions between cognitive states, noisy measurements, and ambiguous cluster boundaries. To address these issues, we develop a robust fuzzy clustering framework in the spectral domain. Our method leverages Kendall's tau-based canonical coherence, which extracts meaningful frequency-specific monotonic relationships between groups of channels or regions. KenCoh effectively captures dominant coherence structures while remaining robust against outliers and noise, making it suitable for real EEG datasets that typically contain artifacts. Our method first projects each MTS object onto vectors derived from the KenCoh estimates (i.e, canonical directions), which capture relevant information on the connectivity structure of oscillatory signals in predefined frequency bands. These spectral features are utilized to determine clusters of epochs using a fuzzy partitioning strategy, accommodating gradual transitions and overlapping class structure. Lastly, we demonstrate the effectiveness of our approach to EEG data where latent cognitive states such as alertness and drowsiness exhibit frequency-specific dynamics and ambiguity. Our method captures both spectral and spatial features by locating the frequency-dependent structure and brain functional connectivity. Built on the KenCoh framework for fuzzy clustering, it handles the complexity of high-dimensional time series data and is broadly applicable to domains such as neuroscience, wearable sensing, environmental monitoring, and finance.


Voxel-Level Brain States Prediction Using Swin Transformer

Sun, Yifei, Chahine, Daniel, Wen, Qinghao, Liu, Tianming, Li, Xiang, Yuan, Yixuan, Calamante, Fernando, Lv, Jinglei

arXiv.org Artificial Intelligence

Understanding brain dynamics is important for neuroscience and mental health. Functional magnetic resonance imaging (fMRI) enables the measurement of neural activities through blood-oxygen-level-dependent (BOLD) signals, which represent brain states. In this study, we aim to predict future human resting brain states with fMRI. Due to the 3D voxel-wise spatial organization and temporal dependencies of the fMRI data, we propose a novel architecture which employs a 4D Shifted Window (Swin) Transformer as encoder to efficiently learn spatio-temporal information and a convolutional decoder to enable brain state prediction at the same spatial and temporal resolution as the input fMRI data. We used 100 unrelated subjects from the Human Connectome Project (HCP) for model training and testing. Our novel model has shown high accuracy when predicting 7.2s resting-state brain activities based on the prior 23.04s fMRI time series. The predicted brain states highly resemble BOLD contrast and dynamics. This work shows promising evidence that the spatiotemporal organization of the human brain can be learned by a Swin Transformer model, at high resolution, which provides a potential for reducing the fMRI scan time and the development of brain-computer interfaces in the future.