OLaPh: Optimal Language Phonemizer
arXiv.org Artificial Intelligence
Phonemization, the conversion of text into phonemes, is a key step in text-to-speech. Traditional approaches use rule-based transformations and lexicon lookups, while more advanced methods apply preprocessing techniques or neural networks for improved accuracy on out-of-domain vocabulary. However, all systems struggle with names, loanwords, abbreviations, and homographs. This work presents OLaPh (Optimal Language Phonemizer), a framework that combines large lexica, multiple NLP techniques, and compound resolution with a probabilistic scoring function. Evaluations in German and English show improved accuracy over previous approaches, including on a challenging dataset. To further address unresolved cases, we train a large language model on OLaPh-generated data, which achieves even stronger generalization and performance. Together, the framework and LLM improve phonemization consistency and provide a freely available resource for future research.
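As a rough illustration of how lexicon lookup, compound resolution, and a probabilistic score can fit together, the sketch below splits an unknown word into known lexicon entries and keeps the segmentation with the highest log-probability. It is a minimal sketch under stated assumptions, not OLaPh's actual scoring function: the toy `LEXICON`, its frequencies, the `SPLIT_PENALTY` value, and the `phonemize` function are all hypothetical and chosen only for demonstration.

```python
import math

# Hypothetical toy lexicon: word -> (phonemes, relative frequency).
# Entries and frequencies are illustrative assumptions, not OLaPh data.
LEXICON = {
    "no": ("noʊ", 0.3),
    "now": ("naʊ", 0.2),
    "where": ("wɛr", 0.2),
    "here": ("hɪr", 0.1),
}

# Assumed cost (in log space) for every additional segment in a split.
SPLIT_PENALTY = math.log(0.5)


def phonemize(word):
    """Return (phonemes, log_score) for the best lexicon segmentation, or None."""
    word = word.lower()
    n = len(word)
    # best[i] holds (log score, phoneme parts) for the prefix word[:i].
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(end):
            if best[start] is None:
                continue  # prefix word[:start] cannot be segmented
            entry = LEXICON.get(word[start:end])
            if entry is None:
                continue  # this piece is not a known word
            phonemes, freq = entry
            score = best[start][0] + math.log(freq)
            if start > 0:
                score += SPLIT_PENALTY  # penalize each extra segment
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [phonemes])
    if best[n] is None:
        return None
    score, parts = best[n]
    return "".join(parts), score
```

With this toy lexicon, `phonemize("nowhere")` prefers the split "no" + "where" over "now" + "here" because the former has the higher combined frequency, and a word with no valid segmentation returns `None`, which is the kind of unresolved case the abstract says is handed to the trained language model.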
Sep-25-2025
- Country:
  - Asia > Japan
    - Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
  - Europe
    - Czechia > Prague (0.04)
    - Netherlands > South Holland
      - Dordrecht (0.04)
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States > Massachusetts
      - Middlesex County > Cambridge (0.04)
- Genre:
  - Research Report (0.40)
- Technology: