ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios

Wang, Yuyue; Xiao, Huan; Wu, Yihan; Song, Ruihua

May-20-2023–arXiv.org Artificial Intelligence

Text-to-speech (TTS) models can generate natural, high-quality speech, but they are not expressive enough when synthesizing dramatically expressive speech such as stand-up comedy. Comedians have diverse personal speech styles, including personal prosody, rhythm, and fillers, so synthesizing their speech requires real-world datasets and strong speech-style modeling, both of which are challenging. In this paper, we construct a new dataset and develop ComedicSpeech, a TTS system tailored for stand-up comedy synthesis in low-resource scenarios. First, we extract a prosody representation with a prosody encoder and use it to condition the TTS model in a flexible way. Second, we enhance personal rhythm modeling with a conditional duration predictor. Third, we model personal fillers by introducing comedian-related special tokens. Experiments show that ComedicSpeech achieves better expressiveness than baselines with only ten minutes of training data per comedian. Audio samples are available at https://xh621.github.io/stand-up-comedy-demo/
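
The third idea in the abstract, comedian-related special tokens for fillers, can be illustrated with a minimal sketch. This is not the authors' implementation: the filler inventory `FILLERS`, the token format `<comedian_filler>`, and the helper names `filler_token`, `tag_fillers`, and `build_vocab` are all assumptions made for illustration. The sketch only shows how the same filler word could map to a different vocabulary entry for each comedian, so that a TTS front end could learn a separate embedding per comedian and filler.

```python
import re

# Hypothetical filler inventory; the abstract does not list the actual fillers.
FILLERS = {"uh", "um", "er"}


def filler_token(comedian_id: str, filler: str) -> str:
    """Build a comedian-specific special token (assumed format), e.g. <c1_uh>."""
    return f"<{comedian_id}_{filler}>"


def tag_fillers(text: str, comedian_id: str) -> list[str]:
    """Lowercase and split text into words, replacing known fillers
    with the comedian-specific special token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [filler_token(comedian_id, t) if t in FILLERS else t for t in tokens]


def build_vocab(transcripts: dict[str, list[str]]) -> dict[str, int]:
    """Assign an integer id to every token. Ordinary words are shared across
    comedians, while each comedian's fillers get their own ids."""
    vocab: dict[str, int] = {}
    for comedian_id, lines in transcripts.items():
        for line in lines:
            for tok in tag_fillers(line, comedian_id):
                vocab.setdefault(tok, len(vocab))
    return vocab


if __name__ == "__main__":
    print(tag_fillers("Uh, so I went home", "c1"))
    print(build_vocab({"c1": ["uh hello"], "c2": ["uh hello"]}))
```

In this sketch the word "hello" shares one vocabulary id across both comedians, while "uh" receives two distinct ids, one per comedian. How the paper actually defines, inserts, and trains its special tokens is not stated in the abstract.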

artificial intelligence, comedian, machine learning

Country:
- North America > Canada
  - British Columbia > Metro Vancouver Regional District
    - Vancouver (0.04)
  - Alberta > Census Division No. 6
    - Calgary Metropolitan Region > Calgary (0.04)
- Europe
  - Austria (0.04)
  - Sweden > Stockholm
    - Stockholm (0.04)
  - Czechia > South Moravian Region
    - Brno (0.04)
- Asia
  - South Korea > Incheon
    - Incheon (0.05)
  - China > Guangdong Province
    - Shenzhen (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Speech > Speech Synthesis (0.74)
  - Vision > Optical Character Recognition (0.63)
