On the Effectiveness of Offline RL for Dialogue Response Generation

Paloma Sodhi, Felix Wu, Ethan R. Elenberg, Kilian Q. Weinberger, Ryan McDonald

arXiv.org Artificial Intelligence 

A common training technique for language models is teacher forcing (TF). TF attempts to match human language exactly, even though identical meanings can be expressed in different ways. This motivates use of sequence-level objectives for dialogue ...

However, this can be expensive to collect. Instead, model-based metrics to measure utterance similarity, such as BERTScore (Zhang et al., 2019) and BLEURT (Sellam et al., 2020), provide a cheaper alternative. These are automated metrics that capture semantic similarity between sentences and tend to have a high correlation with human judgment (Zhang et al., 2019; Sellam et al., 2020).
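
The point about teacher forcing matching human language exactly can be illustrated with a minimal sketch. It is not the authors' implementation: the tokens, the two toy "models", and their probabilities are hypothetical values chosen for illustration, and the loss is the standard token-level negative log-likelihood of the reference. Both toy models only ever open with a valid response ("yes" or "sure"), yet the one that spreads its probability over both openings is penalized more.

```python
import math


def teacher_forcing_nll(step_probs, reference):
    """Token-level negative log-likelihood of `reference`.

    step_probs[t] maps candidate tokens to the (hypothetical) model
    probability at step t, given the gold prefix reference[:t].
    """
    return -sum(math.log(probs[token]) for probs, token in zip(step_probs, reference))


# The single human reference the objective tries to reproduce exactly.
reference = ["yes", "that", "works"]

# Hypothetical model A: concentrates on the reference wording.
model_a = [{"yes": 0.9, "sure": 0.1}, {"that": 1.0}, {"works": 1.0}]

# Hypothetical model B: treats "yes" and "sure" as equally good openings.
model_b = [{"yes": 0.5, "sure": 0.5}, {"that": 1.0}, {"works": 1.0}]

nll_a = teacher_forcing_nll(model_a, reference)
nll_b = teacher_forcing_nll(model_b, reference)

print(f"model A loss: {nll_a:.3f}")
print(f"model B loss: {nll_b:.3f}")
```

Under these assumed numbers, model B receives the higher loss purely because it assigns probability to a paraphrase of the reference. A sequence-level objective would instead score the whole generated utterance, for example with a similarity metric, rather than each token against one fixed wording.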
