CASPER: A Large Scale Spontaneous Speech Dataset

Xiao, Cihan; Liang, Ruixing; Zhang, Xiangyu; Tiryaki, Mehmet Emre; Bae, Veronica; Shankar, Lavanya; Yang, Rong; Poon, Ethan; Dupoux, Emmanuel; Khudanpur, Sanjeev; Perera, Leibny Paola Garcia

arXiv.org Artificial Intelligence 

The majority of participants (67.79%) reported speaking US English, reflecting the dataset's primary demographic. However, a notable share reported non-native or regionally influenced English varieties, including Chinese Mandarin-influenced English (4.81%), UK English (5.29%), and Indian English (2.88%). Additionally, 14.42% of participants did not specify an accent, which may reflect either an omission or variability in self-identification. Both accent and native language are self-reported, so the two need not coincide: for example, the number of speakers with an Arabic accent may differ from the number with Arabic as their native language. The age distribution shows that younger speakers are over-represented, with 57.21% of participants in the 18-29 age range and 23.56% in the 30-39 range.
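The percentages above list only some of the categories, so it can help to see how much of the participant pool they leave unaccounted for. The following is a minimal sketch of that arithmetic, not part of the dataset tooling. It assumes the listed categories are mutually exclusive and share one denominator; the variable and function names are hypothetical.

```python
# Self-reported accent shares (percent of participants), as quoted in the text.
accent_share = {
    "US English": 67.79,
    "Chinese Mandarin-influenced English": 4.81,
    "UK English": 5.29,
    "Indian English": 2.88,
    "Unspecified": 14.42,
}

# Age-range shares (percent of participants), as quoted in the text.
age_share = {"18-29": 57.21, "30-39": 23.56}


def remainder(shares):
    """Percentage not covered by the listed categories.

    Assumption: categories are mutually exclusive and use the same denominator.
    """
    return round(100.0 - sum(shares.values()), 2)


other_accents = remainder(accent_share)       # accents not named in the text
under_40 = round(sum(age_share.values()), 2)  # combined 18-29 and 30-39 share
other_ages = remainder(age_share)             # participants outside those two ranges

print(f"Other accents: {other_accents}%")
print(f"Aged 18-39:    {under_40}%")
print(f"Other ages:    {other_ages}%")
```

Under these assumptions, about 4.81% of participants reported an accent not named above, and 80.77% fall in the two youngest age ranges, leaving 19.23% outside them.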