On the Effectiveness of Offline RL for Dialogue Response Generation
Sodhi, Paloma, Wu, Felix, Elenberg, Ethan R., Weinberger, Kilian Q., McDonald, Ryan
arXiv.org Artificial Intelligence
A common training technique for language models is teacher forcing (TF). TF attempts to match human language exactly, even though identical meanings can be expressed in different ways. This motivates use of sequence-level objectives for dialogue ...
... However, this can be expensive to collect. Instead, model-based metrics to measure utterance similarity, such as BERTScore (Zhang et al., 2019) and BLEURT (Sellam et al., 2020), provide a cheaper alternative. These are automated metrics that capture semantic similarity between sentences and tend to have a high correlation with human judgment (Zhang et al., 2019; Sellam et al., 2020).
Jul-23-2023
- Country:
- North America > United States
- California (0.14)
- Hawaii (0.14)
- Michigan (0.14)
- Genre:
- Research Report
- Experimental Study (0.46)
- New Finding (0.46)
- Industry:
- Leisure & Entertainment (0.93)
- Media > Film (0.67)
- Technology: