The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech

Do, Phat, Coler, Matt, Dijkstra, Jelske, Klabbers, Esther

Jun-1-2023–arXiv.org Artificial Intelligence

We compare phone labels and articulatory features as input for cross-lingual transfer learning in text-to-speech (TTS) for low-resource languages (LRLs). Experiments with FastSpeech 2 and the LRL West Frisian show that using articulatory features outperforms using phone labels in both intelligibility and naturalness. For LRLs without pronunciation dictionaries, we propose two novel approaches: a) using a massively multilingual model for grapheme-to-phone (G2P) conversion in both training and synthesis, and b) using a universal phone recognizer to create a makeshift dictionary. Results show that the G2P approach performs largely on par with using a ground-truth dictionary, and that the phone recognition approach, while generally performing worse, remains a viable option for LRLs less suitable for the G2P approach. Within each approach, using articulatory features as input outperforms using phone labels.
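The difference between the two input types compared in the abstract can be sketched with a toy example. This is only an illustration under stated assumptions: the feature inventory, the phone set, and the function names (`phone_label_input`, `articulatory_input`, `shared_features`) are hypothetical and are not taken from the paper, which does not describe its encoding on this page.

```python
# Toy illustration only: the feature inventory and the per-phone values
# below are hypothetical and are NOT taken from the paper.
FEATURES = ("voiced", "nasal", "labial", "coronal", "vowel")

# Hypothetical articulatory descriptions for a handful of phones.
PHONE_FEATURES = {
    "p": {"labial"},
    "b": {"voiced", "labial"},
    "m": {"voiced", "nasal", "labial"},
    "t": {"coronal"},
    "d": {"voiced", "coronal"},
    "n": {"voiced", "nasal", "coronal"},
    "a": {"voiced", "vowel"},
}

# Phone-label input: every phone is an arbitrary integer ID.
PHONE_IDS = {phone: i for i, phone in enumerate(sorted(PHONE_FEATURES))}


def phone_label_input(phones):
    """Encode each phone as an integer ID (e.g. an embedding-table index)."""
    return [PHONE_IDS[p] for p in phones]


def articulatory_input(phones):
    """Encode each phone as a binary vector over FEATURES."""
    return [
        [1 if feature in PHONE_FEATURES[p] else 0 for feature in FEATURES]
        for p in phones
    ]


def shared_features(phone_a, phone_b):
    """Count the feature positions on which two phones agree."""
    vec_a, vec_b = articulatory_input([phone_a, phone_b])
    return sum(x == y for x, y in zip(vec_a, vec_b))


if __name__ == "__main__":
    word = ["b", "a", "m"]
    print(phone_label_input(word))
    print(articulatory_input(word))
    print(shared_features("b", "m"), shared_features("b", "t"))
```

In this sketch the integer IDs carry no information about how phones relate to one another, whereas the feature vectors place similar phones (here "b" and "m") closer together than dissimilar ones ("b" and "t").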

articulatory feature, lrl, phone label

Genre:
- Research Report > New Finding (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Speech > Speech Synthesis (0.72)
  - Machine Learning > Transfer Learning (0.63)
  - Vision > Optical Character Recognition (0.62)