On the Effectiveness of Offline RL for Dialogue Response Generation

Sodhi, Paloma, Wu, Felix, Elenberg, Ethan R., Weinberger, Kilian Q., McDonald, Ryan

Jul-23-2023 – arXiv.org Artificial Intelligence

A common training technique for language models is teacher forcing (TF). TF attempts to match human language exactly, even though identical meanings can be expressed in different ways. This motivates use of sequence-level objectives for dialogue [...]

However, this can be expensive to collect. Instead, model-based metrics to measure utterance similarity, such as BERTScore (Zhang et al., 2019) and BLEURT (Sellam et al., 2020), provide a cheaper alternative. These are automated metrics that capture semantic similarity between sentences and tend to have a high correlation with human judgment (Zhang et al., 2019; Sellam et al., 2020).
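To make the point about teacher forcing concrete, here is a minimal sketch of a token-level TF loss on a toy next-token model. It is an illustration under stated assumptions, not the paper's implementation: the table TOY_MODEL, its probabilities, and the function name teacher_forcing_nll are all hypothetical. It shows that two one-word replies with the same meaning ("yes" and "sure") receive different losses, because the objective scores exact token matches.

```python
import math

# Hypothetical toy model (an assumption for illustration): next-token
# probabilities conditioned only on the previous token.
TOY_MODEL = {
    "<s>": {"yes": 0.7, "sure": 0.2, "no": 0.1},
    "yes": {"</s>": 0.9, "yes": 0.1},
    "sure": {"</s>": 0.9, "sure": 0.1},
    "no": {"</s>": 0.9, "no": 0.1},
}


def teacher_forcing_nll(reference):
    """Sum of -log p(y_t | y_{t-1}) over the reference reply.

    The gold previous token is fed at every step, which is what
    distinguishes teacher forcing from decoding with sampled tokens.
    """
    prev = "<s>"
    nll = 0.0
    for token in reference + ["</s>"]:
        nll -= math.log(TOY_MODEL[prev][token])
        prev = token  # condition on the reference token, not a model sample
    return nll


# Two replies with the same meaning but different surface forms.
loss_yes = teacher_forcing_nll(["yes"])
loss_sure = teacher_forcing_nll(["sure"])
print(f"TF loss for 'yes':  {loss_yes:.3f}")
print(f"TF loss for 'sure': {loss_sure:.3f}")
```

The paraphrase is penalized more only because the toy model puts less probability on its exact tokens. A sequence-level objective would instead score the whole reply, for example with a learned similarity metric such as BERTScore or BLEURT; that scoring step is not implemented here.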
