WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities
Ziyi Zeng, Zhenyang Cai, Yixi Cai, Xidong Wang, Junying Chen, Rongsheng Wang, Yipeng Liu, Siqi Cai, Benyou Wang, Zhiguo Zhang, Haizhou Li
arXiv.org Artificial Intelligence
Electroencephalography (EEG) interpretation using multimodal large language models (MLLMs) offers a novel approach to analyzing brain signals. However, the complex nature of brain activity introduces critical challenges: EEG signals simultaneously encode both cognitive processes and intrinsic neural states, creating a modality mismatch in paired EEG data that hinders effective cross-modal representation learning. Through a pilot investigation, we uncover complementary relationships between these modalities. Leveraging this insight, we propose mapping EEG signals and their corresponding modalities into a unified semantic space to achieve generalized interpretation. To fully enable conversational capabilities, we further introduce WaveMind-Instruct-338k, the first cross-task EEG dataset for instruction tuning. The resulting model demonstrates robust classification accuracy while supporting flexible, open-ended conversations across four downstream tasks, thereby offering valuable insights for both neuroscience research and the development of general-purpose EEG models.
Oct-2-2025
- Country:
- Asia > China
- Guangdong Province > Shenzhen (0.04)
- Heilongjiang Province > Harbin (0.04)
- Hong Kong (0.04)
- Genre:
- Research Report > New Finding (0.93)
- Industry:
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science (1.00)
- Machine Learning
- Neural Networks > Deep Learning (0.68)
- Performance Analysis > Accuracy (0.48)
- Natural Language > Large Language Model (1.00)
- Representation & Reasoning (1.00)
- Vision (1.00)