Evaluating the Usefulness of Non-Diagnostic Speech Data for Developing Parkinson's Disease Classifiers
Zhong, Terry Yi, Janse, Esther, Tejedor-Garcia, Cristian, Bosch, Louis ten, Larson, Martha
arXiv.org Artificial Intelligence
Speech-based Parkinson's disease (PD) detection has gained attention for its automated, cost-effective, and non-intrusive nature. As research studies usually rely on data from diagnostic-oriented speech tasks, this work explores the feasibility of diagnosing PD on the basis of speech data not originally intended for diagnostic purposes, using the Turn-Taking (TT) dataset. Our findings indicate that TT can be as useful as diagnostic-oriented PD datasets like PC-GITA. We also investigate which specific dataset characteristics impact PD classification performance. The results show that concatenating audio recordings and balancing participants' gender and status distributions can be beneficial. Cross-dataset evaluation reveals that models trained on PC-GITA generalize poorly to TT, whereas models trained on TT perform better on PC-GITA. Furthermore, we provide insights into the high variability across folds, which is mainly due to large differences in individual speaker performance.
Aug-29-2025
- Country:
- Europe > Netherlands (0.04)
- North America > Cuba
- La Habana Province > Havana (0.04)
- Genre:
- Research Report
- Experimental Study > Negative Result (0.46)
- New Finding (1.00)
- Industry:
- Health & Medicine > Therapeutic Area
- Musculoskeletal (1.00)
- Neurology > Parkinson's Disease (1.00)
- Technology: