Lee, Hoyeon
A Two-Step Approach for Data-Efficient French Pronunciation Learning
Lee, Hoyeon, Jang, Hyeeun, Kim, Jong-Hwan, Kim, Jae-Min
Recent studies have addressed intricate phonological phenomena in French, relying on either extensive linguistic knowledge or a significant amount of sentence-level pronunciation data. However, creating such resources is expensive and non-trivial. To address this, we propose a novel two-step approach that encompasses two pronunciation tasks: grapheme-to-phoneme conversion and post-lexical processing. We then investigate the efficacy of the proposed approach with a notably limited amount of sentence-level pronunciation data. Our findings demonstrate that the proposed two-step approach effectively mitigates the lack of extensive labeled data and serves as a feasible solution for addressing French phonological phenomena even in resource-constrained environments.
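To make the division of labor between the two steps concrete, the following is a minimal sketch and not the authors' method: a dictionary lookup stands in for grapheme-to-phoneme conversion, and a single hand-written liaison rule stands in for post-lexical processing. The abstract does not describe how either step is implemented, so the lexicon, the liaison table, the vowel set, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy lexicon: pronunciation of each word in isolation.
# This stands in for step 1 (grapheme-to-phoneme conversion).
LEXICON = {"les": "le", "petit": "pəti", "amis": "ami", "ami": "ami", "chats": "ʃa"}

# Hypothetical table of latent final consonants that surface in liaison.
LIAISON = {"les": "z", "petit": "t"}

# Assumed (incomplete) set of vowel symbols used to detect a vowel-initial word.
VOWELS = set("aeiouyəɛɔøœɑ")


def g2p(words):
    """Step 1: word-level pronunciations, ignoring sentence context."""
    return [LEXICON[w] for w in words]


def post_lexical(words, phones):
    """Step 2: adjust word-level pronunciations using the following word.

    Only one phenomenon is modeled here: a latent consonant is appended
    when the next word begins with a vowel (liaison).
    """
    out = []
    for i, (word, phone) in enumerate(zip(words, phones)):
        following = phones[i + 1] if i + 1 < len(phones) else ""
        if word in LIAISON and following and following[0] in VOWELS:
            phone += LIAISON[word]
        out.append(phone)
    return out


def pronounce(sentence):
    """Run both steps on a whitespace-tokenized sentence."""
    words = sentence.lower().split()
    return " ".join(post_lexical(words, g2p(words)))


print(pronounce("les amis"))
print(pronounce("les chats"))
```

In this sketch only the second step needs sentence-level context, which is the kind of separation that would let word-level resources carry most of the load when sentence-level pronunciation data is scarce.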
Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model
Lee, Hoyeon, Yoon, Hyun-Wook, Kim, Jong-Hwan, Kim, Jae-Min
Phrase break prediction is a crucial task for improving the prosodic naturalness of a text-to-speech (TTS) system. However, most proposed phrase break prediction models are monolingual and are trained exclusively on a large amount of labeled data. In this paper, we address this issue for low-resource languages with limited labeled data using cross-lingual transfer. We investigate the effectiveness of zero-shot and few-shot cross-lingual transfer for phrase break prediction using a pre-trained multilingual language model. We use manually collected datasets in four Indo-European languages: one high-resource language and three with limited resources. Our findings demonstrate that cross-lingual transfer learning can be an effective approach, especially in the few-shot setting, for improving performance in low-resource languages. This suggests that cross-lingual transfer can be an inexpensive and effective way to develop a TTS front-end for resource-poor languages.
OpenKBP-Opt: An international and reproducible evaluation of 76 knowledge-based planning pipelines
Babier, Aaron, Mahmood, Rafid, Zhang, Binghao, Alves, Victor G. L., Barragán-Montero, Ana Maria, Beaudry, Joel, Cardenas, Carlos E., Chang, Yankui, Chen, Zijie, Chun, Jaehee, Diaz, Kelly, Eraso, Harold David, Faustmann, Erik, Gaj, Sibaji, Gay, Skylar, Gronberg, Mary, Guo, Bingqi, He, Junjun, Heilemann, Gerd, Hira, Sanchit, Huang, Yuliang, Ji, Fuxin, Jiang, Dashan, Giraldo, Jean Carlo Jimenez, Lee, Hoyeon, Lian, Jun, Liu, Shuolin, Liu, Keng-Chi, Marrugo, José, Miki, Kentaro, Nakamura, Kunio, Netherton, Tucker, Nguyen, Dan, Nourzadeh, Hamidreza, Osman, Alexander F. I., Peng, Zhao, Muñoz, José Darío Quinto, Ramsl, Christian, Rhee, Dong Joo, Rodriguez, Juan David, Shan, Hongming, Siebers, Jeffrey V., Soomro, Mumtaz H., Sun, Kay, Hoyos, Andrés Usuga, Valderrama, Carlos, Verbeek, Rob, Wang, Enpei, Willems, Siri, Wu, Qi, Xu, Xuanang, Yang, Sen, Yuan, Lulin, Zhu, Simeng, Zimmermann, Lukas, Moore, Kevin L., Purdie, Thomas G., McNiven, Andrea L., Chan, Timothy C. Y.
We establish an open framework for developing plan optimization models for knowledge-based planning (KBP) in radiotherapy. Our framework includes reference plans for 100 patients with head-and-neck cancer and high-quality dose predictions from 19 KBP models that were developed by different research groups during the OpenKBP Grand Challenge. The dose predictions were input to four optimization models to form 76 unique KBP pipelines that generated 7600 plans. The predictions and plans were compared to the reference plans via: the dose score, which is the average mean absolute voxel-by-voxel difference in dose a model achieved; the deviation in dose-volume histogram (DVH) criteria; and the frequency of clinical planning criteria satisfaction. We also performed a theoretical investigation to justify our dose mimicking models. The rank-order correlation of the dose score between predictions and their KBP pipelines ranged from 0.50 to 0.62, which indicates that the quality of the predictions is generally positively correlated with the quality of the plans. Additionally, compared to the input predictions, the KBP-generated plans performed significantly better (P<0.05; one-sided Wilcoxon test) on 18 of 23 DVH criteria. Similarly, each optimization model generated plans that satisfied a higher percentage of criteria than the reference plans. Lastly, our theoretical investigation demonstrated that the dose mimicking models generated plans that are also optimal for a conventional planning model. This was the largest international effort to date for evaluating the combination of KBP prediction and optimization models. In the interest of reproducibility, our data and code are freely available at https://github.com/ababier/open-kbp-opt.
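The dose score described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the linked repository: the function name, the argument layout (one dose array per patient), and the optional per-patient voxel mask are all hypothetical. The abstract defines the score only as the average, across patients, of the mean absolute voxel-by-voxel dose difference from the reference.

```python
import numpy as np


def dose_score(reference_doses, candidate_doses, masks=None):
    """Average over patients of the mean absolute voxel-wise dose difference.

    reference_doses, candidate_doses: sequences of same-shaped arrays,
        one pair per patient (hypothetical layout).
    masks: optional sequence of boolean arrays selecting which voxels
        to score (an assumption, not stated in the abstract).
    """
    per_patient = []
    for i, (ref, cand) in enumerate(zip(reference_doses, candidate_doses)):
        abs_diff = np.abs(np.asarray(ref, dtype=float) - np.asarray(cand, dtype=float))
        if masks is not None:
            # Keep only the voxels selected for this patient.
            abs_diff = abs_diff[np.asarray(masks[i], dtype=bool)]
        per_patient.append(abs_diff.mean())
    return float(np.mean(per_patient))


# Two toy "patients" on a 2x2x2 grid: the first differs by 2 units in
# every voxel, the second matches the reference exactly.
reference = [np.zeros((2, 2, 2)), np.ones((2, 2, 2))]
candidate = [np.full((2, 2, 2), 2.0), np.ones((2, 2, 2))]
print(dose_score(reference, candidate))
```

A lower score means the prediction or plan is closer to the reference dose. Averaging per patient first, rather than pooling all voxels, keeps patients with larger scored volumes from dominating the result.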