Goto

Collaborating Authors

 hc group




Linguistic trajectories of bipolar disorder on social media

arXiv.org Artificial Intelligence

Correspondence should be addressed to: Laurin Plank. This paper has not yet been peer - reviewed Abstract Language provides valuable markers of affective disorders such as bipolar disorder (BD), yet clinical assessments remain limited in scale. In response, analyses of social media (SM) language have gained prominence due to their high temporal resolution and longitudinal scope. Here, we introduce a method to determine the timing of users' diagnoses and apply it to study language trajectories from 3 years before to 21 years after BD diagnosis - contrasted with uses reporting unipolar depression (UD) and non - aff ected users (HC). We show that BD diagnosis is accompanied by pervasive linguistic alterations reflecting mood disturbance, psychiatric comorbidity, substance abuse, hospitalization, medical comorbidities, unusual thought content, and disorganized thought. W e further observe recurring mood - related language change s across two decades after the diagnosis, with a pronounced 12 - month periodicity suggestive of seasonal mood episodes. Finally, trend - level evidence suggests an increased periodicity in users estima ted to be female. In sum, our findings provide evidence for language alterations in the acute and chronic phase of BD. Th i s validates and extends recent efforts leveraging SM for scalable monitoring of mental health. Knowledge of diagnosis events allows language alterations to be contextualized with respect to the current disorder phase . For example, it would allow comparing language change from a premorbid to the acute disorder phase, or to study long - term behavioral patterns in the chronic disorder phase . W e then use the resulting digital clinical cohorts (DICCs) to study longitudinal language trajectories in users who self - disclose having been diagnosed with BD. This time information is then passed to SUTime, a temporal parsing algorithm, which yielded normalized datetime information. T hese data are additionally filtered through a rule - based algorithm to exclude non - viable datetimes (e.g., those including seasonal information such as "spring, 2022"). Pseudo - diagnoses are assigned to a group of regular Reddit users who served as a healthy control group (HC). Fig . 1 gives an overview of the DICC s pipeline.


Explainable Brain Age Gap Prediction in Neurodegenerative Conditions using coVariance Neural Networks

arXiv.org Artificial Intelligence

Brain age is the estimate of biological age derived from neuroimaging datasets using machine learning algorithms. Increasing \textit{brain age gap} characterized by an elevated brain age relative to the chronological age can reflect increased vulnerability to neurodegeneration and cognitive decline. Hence, brain age gap is a promising biomarker for monitoring brain health. However, black-box machine learning approaches to brain age gap prediction have limited practical utility. Recent studies on coVariance neural networks (VNN) have proposed a relatively transparent deep learning pipeline for neuroimaging data analyses, which possesses two key features: (i) inherent \textit{anatomically interpretablity} of derived biomarkers; and (ii) a methodologically interpretable perspective based on \textit{linkage with eigenvectors of anatomic covariance matrix}. In this paper, we apply the VNN-based approach to study brain age gap using cortical thickness features for various prevalent neurodegenerative conditions. Our results reveal distinct anatomic patterns for brain age gap in Alzheimer's disease, frontotemporal dementia, and atypical Parkinsonian disorders. Furthermore, we demonstrate that the distinct anatomic patterns of brain age gap are linked with the differences in how VNN leverages the eigenspectrum of the anatomic covariance matrix, thus lending explainability to the reported results.


Comprehensive Methodology for Sample Augmentation in EEG Biomarker Studies for Alzheimers Risk Classification

arXiv.org Artificial Intelligence

Background: Dementia, characterized by progressive cognitive decline, is a major global health challenge. Alzheimer's disease (AD) is the predominant type, accounting for approximately 70% of dementia cases worldwide. Electroencephalography (EEG)-derived measures have shown potential in identifying AD risk, but obtaining sufficiently large samples for reliable comparisons remains a challenge. Objective: This study implements a comprehensive methodology that integrates signal processing, data harmonization, and statistical techniques to increase sample size and improve the reliability of Alzheimer's disease risk classification models. Methods: We used a multi-step approach combining advanced EEG preprocessing, feature extraction, harmonization techniques, and propensity score matching (PSM) to optimize the balance between healthy non-carriers (HC) and asymptomatic E280A mutation Alzheimer's disease carriers (ACr). Data were harmonized across four databases, adjusting for site effects while preserving important covariate effects such as age and sex. PSM was applied at different ratios (2:1, 5:1, and 10:1) to explore the impact of sample size differences on model performance. The final dataset was subjected to machine learning analysis using decision trees, with cross-validation to ensure robust model performance.


NeuroVoz: a Castillian Spanish corpus of parkinsonian speech

arXiv.org Artificial Intelligence

The advancement of Parkinson's Disease (PD) diagnosis through speech analysis is hindered by a notable lack of publicly available, diverse language datasets, limiting the reproducibility and further exploration of existing research. In response to this gap, we introduce a comprehensive corpus from 108 native Castilian Spanish speakers, comprising 55 healthy controls and 53 individuals diagnosed with PD, all of whom were under pharmacological treatment and recorded in their medication-optimized state. This unique dataset features a wide array of speech tasks, including sustained phonation of the five Spanish vowels, diadochokinetic tests, 16 listen-and-repeat utterances, and free monologues. The dataset emphasizes accuracy and reliability through specialist manual transcriptions of the listen-and-repeat tasks and utilizes Whisper for automated monologue transcriptions, making it the most complete public corpus of Parkinsonian speech, and the first in Castillian Spanish. NeuroVoz is composed by 2,903 audio recordings averaging $26.88 \pm 3.35$ recordings per participant, offering a substantial resource for the scientific exploration of PD's impact on speech. This dataset has already underpinned several studies, achieving a benchmark accuracy of 89% in PD speech pattern identification, indicating marked speech alterations attributable to PD. Despite these advances, the broader challenge of conducting a language-agnostic, cross-corpora analysis of Parkinsonian speech patterns remains an open area for future research. This contribution not only fills a critical void in PD speech analysis resources but also sets a new standard for the global research community in leveraging speech as a diagnostic tool for neurodegenerative diseases.


Towards a Foundation Model for Brain Age Prediction using coVariance Neural Networks

arXiv.org Artificial Intelligence

Brain age is the estimate of biological age derived from neuroimaging datasets using machine learning algorithms. Increasing brain age with respect to chronological age can reflect increased vulnerability to neurodegeneration and cognitive decline. In this paper, we study NeuroVNN, based on coVariance neural networks, as a paradigm for foundation model for the brain age prediction application. NeuroVNN is pre-trained as a regression model on healthy population to predict chronological age using cortical thickness features and fine-tuned to estimate brain age in different neurological contexts. Importantly, NeuroVNN adds anatomical interpretability to brain age and has a `scale-free' characteristic that allows its transference to datasets curated according to any arbitrary brain atlas. Our results demonstrate that NeuroVNN can extract biologically plausible brain age estimates in different populations, as well as transfer successfully to datasets of dimensionalities distinct from that for the dataset used to train NeuroVNN.


Explainable Brain Age Prediction using coVariance Neural Networks

arXiv.org Artificial Intelligence

In computational neuroscience, there has been an increased interest in developing machine learning algorithms that leverage brain imaging data to provide estimates of "brain age" for an individual. Importantly, the discordance between brain age and chronological age (referred to as "brain age gap") can capture accelerated aging due to adverse health conditions and therefore, can reflect increased vulnerability towards neurological disease or cognitive impairments. However, widespread adoption of brain age for clinical decision support has been hindered due to lack of transparency and methodological justifications in most existing brain age prediction algorithms. In this paper, we leverage coVariance neural networks (VNN) to propose an explanation-driven and anatomically interpretable framework for brain age prediction using cortical thickness features. Specifically, our brain age prediction framework extends beyond the coarse metric of brain age gap in Alzheimer's disease (AD) and we make two important observations: (i) VNNs can assign anatomical interpretability to elevated brain age gap in AD by identifying contributing brain regions, (ii) the interpretability offered by VNNs is contingent on their ability to exploit specific eigenvectors of the anatomical covariance matrix. Together, these observations facilitate an explainable and anatomically interpretable perspective to the task of brain age prediction.


Transferability of coVariance Neural Networks and Application to Interpretable Brain Age Prediction using Anatomical Features

arXiv.org Artificial Intelligence

Graph convolutional networks (GCN) leverage topology-driven graph convolutional operations to combine information across the graph for inference tasks. In our recent work, we have studied GCNs with covariance matrices as graphs in the form of coVariance neural networks (VNNs) that draw similarities with traditional PCA-driven data analysis approaches while offering significant advantages over them. In this paper, we first focus on theoretically characterizing the transferability of VNNs. The notion of transferability is motivated from the intuitive expectation that learning models could generalize to "compatible" datasets (possibly of different dimensionalities) with minimal effort. VNNs inherit the scale-free data processing architecture from GCNs and here, we show that VNNs exhibit transferability of performance over datasets whose covariance matrices converge to a limit object. Multi-scale neuroimaging datasets enable the study of the brain at multiple scales and hence, can validate the theoretical results on the transferability of VNNs. To gauge the advantages offered by VNNs in neuroimaging data analysis, we focus on the task of "brain age" prediction using cortical thickness features. In clinical neuroscience, there has been an increased interest in machine learning algorithms which provide estimates of "brain age" that deviate from chronological age. We leverage the architecture of VNNs to extend beyond the coarse metric of brain age gap in Alzheimer's disease (AD) and make two important observations: (i) VNNs can assign anatomical interpretability to elevated brain age gap in AD, and (ii) the interpretability offered by VNNs is contingent on their ability to exploit specific principal components of the anatomical covariance matrix. We further leverage the transferability of VNNs to cross validate the above observations across different datasets.