ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis

Toyin, Hawau Olamide, Marew, Rufael, Alblooshi, Humaid, Magdy, Samar M., Aldarmaki, Hanan

May-28-2025–arXiv.org Artificial Intelligence

We introduce ArVoice, a multi-speaker Modern Standard Arabic (MSA) speech corpus with diacritized transcriptions. It is intended for multi-speaker speech synthesis and can also be useful for other tasks such as speech-based diacritic restoration, voice conversion, and deepfake detection. ArVoice comprises: (1) a new professionally recorded set from six voice talents with diverse demographics; (2) a modified subset of the Arabic Speech Corpus; and (3) high-quality synthetic speech from two commercial systems. The complete corpus totals 83.52 hours of speech across 11 voices, of which around 10 hours are human speech from 7 speakers. We train three open-source TTS systems and two voice conversion systems to illustrate the use cases of the dataset. The corpus is available for research use.

large language model, machine learning, natural language

Country:
- Asia > Middle East (0.46)

Genre:
- Research Report > New Finding (0.47)

Industry:
- Information Technology > Security & Privacy (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (0.88)
  - Natural Language > Large Language Model (0.68)
  - Speech
    - Speech Synthesis (0.75)
    - Speech Recognition (0.69)
