Leveraging Implicit Feedback from Deployment Data in Dialogue

Pang, Richard Yuanzhe, Roller, Stephen, Cho, Kyunghyun, He, He, Weston, Jason

arXiv.org Artificial Intelligence 

We study how to improve social conversational agents by learning from natural dialogue between users and a deployed model, without extra annotations. To implicitly measure the quality of a machine-generated utterance, we leverage signals such as the length, sentiment, and reaction of the human utterances that follow it in the collected dialogue episodes. Our experiments use the publicly released deployment data from BlenderBot (Xu et al., 2023). Human evaluation indicates improvements in our new models over baseline responses; however, we find that some proxy signals can also lead to more generations with undesirable properties. For example, optimizing for conversation length can lead to more controversial or unfriendly generations compared to the baseline, whereas optimizing for positive sentiment or reaction can decrease these behaviors.
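To make the proxy signals concrete, below is a minimal, hypothetical sketch (not the authors' code) of how one might score each bot utterance in a collected episode using the length and a crude lexicon sentiment of the human's next reply. The word lists, weighting, and episode format are illustrative assumptions only.

```python
# Hypothetical sketch of implicit-feedback scoring; word lists and
# signals are illustrative assumptions, not the paper's actual method.

POSITIVE = {"great", "thanks", "love", "cool", "haha", "nice", "awesome"}
NEGATIVE = {"no", "stop", "boring", "wrong", "hate", "bad"}

def sentiment_score(utterance: str) -> float:
    """Crude lexicon-based sentiment of a human reply, in [-1, 1]."""
    tokens = utterance.lower().split()
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

def implicit_rewards(episode: list[dict]) -> list[tuple[str, float, float]]:
    """For each bot turn followed by a human turn, return
    (bot_utterance, reply_length_signal, reply_sentiment_signal)."""
    scored = []
    for turn, nxt in zip(episode, episode[1:]):
        if turn["speaker"] == "bot" and nxt["speaker"] == "human":
            reply = nxt["text"]
            # Longer replies are treated as a weak engagement signal.
            length_signal = float(len(reply.split()))
            scored.append((turn["text"], length_signal, sentiment_score(reply)))
    return scored

episode = [
    {"speaker": "bot", "text": "Do you have any hobbies?"},
    {"speaker": "human", "text": "I love hiking, it is awesome on weekends"},
    {"speaker": "bot", "text": "Politics is a great topic, right?"},
    {"speaker": "human", "text": "no stop"},
]
for bot_text, length_sig, sent_sig in implicit_rewards(episode):
    print(f"{bot_text!r}: length={length_sig}, sentiment={sent_sig:+.2f}")
```

Scores like these could then be used to rank or filter candidate responses when fine-tuning or reranking the deployed model; the abstract notes that the choice of signal matters, since length alone may reward controversial utterances while sentiment-based signals may suppress them.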
