Code-switched inspired losses for generic spoken dialog representations

Chapuis, Emile, Colombo, Pierre, Labeau, Matthieu, Clavel, Chloe

arXiv.org Artificial Intelligence 

A crucial step in conversational AI is the identification of the underlying information in the user's utterance (e.g., communicative intent or dialog acts, and emotions). This requires modeling utterance-level information (Mitkov, 2014; Williams et al., 2014), to capture immediate nuances of the user utterance; and discourse-level features (Thornbury and Slade, 2006), to capture patterns over long ranges of the conversation. An added difficulty to this problem is that most people in the world are bilingual (Grosjean and Li, 2013): therefore, progress on these systems is limited by their inability to process more than one language (English being the

While there has been a growing interest in pretraining for dialog (Mehri et al., 2019; Zhang et al., 2019d), the focus has mainly been on English datasets. Thus, these works cannot be directly applied to our multilingual setting. Additionally, available multilingual pretraining objectives (Lample and Conneau, 2019; Liu et al., 2020; Xue et al., 2020; Qi et al., 2021) face two main limitations when applied to dialog modeling: (1) they are a generalization of monolingual objectives that use flat input text, whereas hierarchy has been shown to be a powerful prior for dialog modeling. This is a reflection of a dialog itself; for example, context plays an essential role in the labeling of dialog acts.