A Tutorial on Clinical Speech AI Development: From Data Collection to Model Validation
Ng, Si-Ioi; Xu, Lingfeng; Siegert, Ingo; Cummins, Nicholas; Benway, Nina R.; Liss, Julie; Berisha, Visar
arXiv.org Artificial Intelligence
There has been a surge of interest in leveraging speech as a marker of health for a wide spectrum of conditions. The underlying premise is that any neurological, mental, or physical deficit that impacts speech production can be objectively assessed via automated analysis of speech. Recent speech-based Artificial Intelligence (AI) models for diagnosing and tracking mental health, cognitive, and motor disorders often use supervised learning, as do mainstream speech technologies such as speech recognition and speaker verification. However, clinical speech AI has distinct challenges, including the need for specific elicitation tasks, small available datasets, diverse speech representations, and uncertain diagnostic labels. As a result, applying the standard supervised learning paradigm may lead to models that perform well in controlled settings but fail to generalize in real-world clinical deployments. With translation into real-world clinical scenarios in mind, this tutorial paper provides an overview of the key components required for robust development of clinical speech AI. Specifically, the paper covers the design of speech elicitation tasks and protocols most appropriate for different clinical conditions, collection of data and verification of hardware, development and validation of speech representations designed to measure clinical constructs of interest, development of reliable and robust clinical prediction models, and ethical and participant considerations for clinical speech AI. The goal is to provide comprehensive guidance on building models whose inputs and outputs link to the more interpretable and clinically meaningful aspects of speech, that can be interrogated and validated on clinical datasets, and that adhere to ethical, privacy, and security considerations by design.
Oct-28-2024
- Country:
  - South America > Chile
  - North America > United States
    - Massachusetts (0.04)
    - California (0.04)
    - Wisconsin > Dane County
      - Madison (0.04)
    - New York > New York County
      - New York City (0.04)
    - Maryland > Prince George's County
      - College Park (0.14)
    - Arizona > Maricopa County
      - Tempe (0.04)
  - Europe
    - United Kingdom > England
      - Greater London > London (0.04)
    - Germany
      - Saxony-Anhalt > Magdeburg (0.04)
      - Bavaria > Upper Bavaria
        - Munich (0.04)
    - France
      - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
        - Marseille (0.04)
      - Auvergne-Rhône-Alpes > Lyon
        - Lyon (0.04)
  - Asia > China
    - Hong Kong (0.04)
- Genre:
  - Instructional Material > Course Syllabus & Notes (0.99)
  - Overview (0.86)
  - Research Report
    - New Finding (1.00)
    - Experimental Study (1.00)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Education (1.00)
  - Health & Medicine
    - Pharmaceuticals & Biotechnology (1.00)
    - Health Care Technology (1.00)
    - Diagnostic Medicine (1.00)
    - Consumer Health (1.00)
    - Therapeutic Area
      - Pulmonary/Respiratory Diseases (1.00)
      - Psychiatry/Psychology (1.00)
      - Oncology (1.00)
      - Musculoskeletal (1.00)
      - Neurology > Parkinson's Disease (0.93)
- Technology:
  - Information Technology
    - Data Science > Data Mining (1.00)
    - Artificial Intelligence
      - Speech > Speech Recognition (1.00)
      - Representation & Reasoning (1.00)
      - Cognitive Science (1.00)
      - Natural Language
        - Large Language Model (1.00)
        - Text Processing (0.92)
        - Chatbot (0.68)
      - Machine Learning
        - Neural Networks > Deep Learning (1.00)
        - Statistical Learning > Regression (0.67)