Collaborating Authors

 Mirheidari, Bahman


Exploring Gender Disparities in Automatic Speech Recognition Technology

arXiv.org Artificial Intelligence

This study investigates factors influencing Automatic Speech Recognition (ASR) systems' fairness and performance across genders, beyond the conventional examination of demographics. Using the LibriSpeech dataset and the Whisper small model, we analyze how performance varies across different gender representations in training data. Our findings suggest a complex interplay between the gender ratio in training data and ASR performance. Optimal fairness occurs at specific gender distributions rather than a simple 50-50 split. Furthermore, our findings suggest that factors like pitch variability can significantly affect ASR accuracy. This research contributes to a deeper understanding of biases in ASR systems, highlighting the importance of carefully curated training data in mitigating gender bias.


CognoSpeak: an automatic, remote assessment of early cognitive decline in real-world conversational speech

arXiv.org Artificial Intelligence

The early signs of cognitive decline are often noticeable in conversational speech, and identifying those signs is crucial in dealing with later and more serious stages of neurodegenerative diseases. Clinical detection is costly and time-consuming, and although there has been recent progress in the automatic detection of speech-based cues, those systems are trained on relatively small databases that lack detailed metadata and demographic information. This paper presents CognoSpeak and its associated data collection efforts. CognoSpeak asks long- and short-term memory-probing questions and administers standard cognitive tasks such as verbal and semantic fluency and picture description using a virtual agent on a mobile or web platform. In addition, it collects multimodal data such as audio and video, along with a rich set of metadata, from primary and secondary care, memory clinics and remote settings like people's homes. Here, we present results from 126 subjects whose audio was manually transcribed. Several classic classifiers, as well as large language model-based classifiers, have been investigated and evaluated across the different types of prompts. We demonstrate a high level of performance; in particular, we achieved an F1-score of 0.873 using a DistilBERT model to discriminate people with cognitive impairment (dementia or mild cognitive impairment (MCI)) from healthy volunteers using the memory responses, fluency tasks and Cookie Theft picture description. CognoSpeak is an automatic, remote, low-cost, repeatable, non-invasive and less stressful alternative to existing clinical cognitive assessments.


Early Dementia Detection Using Multiple Spontaneous Speech Prompts: The PROCESS Challenge

arXiv.org Artificial Intelligence

Dementia is associated with various cognitive impairments and typically manifests only after significant progression, making intervention at this stage often ineffective. To address this issue, the Prediction and Recognition of Cognitive Decline through Spontaneous Speech (PROCESS) Signal Processing Grand Challenge invites participants to focus on early-stage dementia detection. We provide a new spontaneous speech corpus for this challenge. This corpus includes answers from three prompts designed by neurologists to better capture the cognition of speakers. Our baseline models achieved an F1-score of 55.0% on the classification task and [...]

Second, the audio quality of the data is poor and does not represent the quality that it is possible to achieve even with current, standard consumer-based devices like modern laptops. These factors underscore the necessity for new data sets to ensure the continued advancement and accuracy of research in this field. The PROCESS Signal Processing Grand Challenge aims to establish a platform for contributions and discussions on early-stage dementia detection using speech signal processing and Artificial Intelligence (AI) models. To support this, we provide a state-of-the-art corpus covering a broader range of diagnostic classes for different subtypes of early-stage dementia, including mild cognitive impairment (MCI).