Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for Polish

Jul-18-2024–arXiv.org Artificial Intelligence

Speech datasets available in the public domain are often underutilized because they are hard to discover and hard to use together. A comprehensive framework has been designed to survey, catalog, and curate available speech datasets, enabling replicable evaluation of automatic speech recognition (ASR) systems. In a case study on the Polish language, the framework was applied to curate more than 24 datasets and evaluate 25 combinations of ASR systems and models. This research constitutes the most extensive comparison to date of commercial and free ASR systems for Polish. It draws insights from 600 evaluations, each pairing a system-model combination with a test set, a significant advance in both scale and comprehensiveness.

arxiv preprint, asr system, dataset

Country:
- North America > United States (0.04)
- Europe
  - France (0.04)
  - Poland
    - Greater Poland Province > Poznań (0.04)
    - Łódź Province > Łódź (0.04)
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
- Asia > Japan
  - Kyūshū & Okinawa > Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)

Genre:
- Research Report (0.82)
- Overview (0.68)

Technology:
- Information Technology
  - Communications (0.94)
  - Information Management (0.93)
  - Artificial Intelligence
    - Speech > Speech Recognition (1.00)
    - Natural Language (1.00)
    - Machine Learning (1.00)