GoChat: Goal-oriented Chatbots with Hierarchical Reinforcement Learning

May-26-2020 – arXiv.org Artificial Intelligence

A chatbot that converses like a human should be goal-oriented (i.e., purposeful in conversation), which goes beyond language generation. However, existing dialogue systems often rely heavily on cumbersome hand-crafted rules or costly labelled datasets to reach their goals. In this paper, we propose Goal-oriented Chatbots (GoChat), a framework for training chatbots end-to-end to maximize the long-term return from offline multi-turn dialogue datasets. Our framework utilizes hierarchical reinforcement learning (HRL), where the high-level policy guides the conversation towards the final goal by determining sub-goals, and the low-level policy fulfills each sub-goal by generating the corresponding response utterance. In our experiments on a real-world dialogue dataset for anti-fraud in finance, our approach outperforms previous methods on both the quality of response generation and the success rate of accomplishing the goal.
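The two-level control flow described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: the sub-goal names, the response templates, the fixed turn schedule, and the discount factor `gamma` are all hypothetical stand-ins, and both "policies" are hand-written functions rather than learned models. The sketch only shows how a high-level choice of sub-goal conditions a low-level response, and how a long-term (discounted) return is computed from per-turn rewards.

```python
from typing import List, Tuple

# Hypothetical sub-goals for illustration; the source does not enumerate them.
SUB_GOALS = ["greet", "verify_identity", "close"]

# Hypothetical canned responses standing in for a learned utterance generator.
TEMPLATES = {
    "greet": "Hello, how can I help you today?",
    "verify_identity": "Could you confirm the name on the account?",
    "close": "Thank you, that is all I need.",
}


def high_level_policy(turn: int) -> str:
    """Choose a sub-goal for this turn (assumption: a fixed schedule, not a learned policy)."""
    return SUB_GOALS[min(turn, len(SUB_GOALS) - 1)]


def low_level_policy(sub_goal: str, user_utterance: str) -> str:
    """Produce a response for the sub-goal (assumption: template lookup; the user text is ignored)."""
    return TEMPLATES[sub_goal]


def discounted_return(rewards: List[float], gamma: float = 0.99) -> float:
    """Standard discounted return: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def run_episode(user_turns: List[str]) -> List[Tuple[str, str]]:
    """Run one dialogue: at each turn pick a sub-goal, then generate a response for it."""
    transcript = []
    for turn, utterance in enumerate(user_turns):
        sub_goal = high_level_policy(turn)
        response = low_level_policy(sub_goal, utterance)
        transcript.append((sub_goal, response))
    return transcript
```

For example, `run_episode(["hi", "sure", "ok", "bye"])` yields one `(sub_goal, response)` pair per user turn, and `discounted_return` scores a whole dialogue from its per-turn rewards. In a learned system, both policy functions would be replaced by trained models.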

machine learning, natural language, reinforcement learning

Country:
- Asia > China (0.05)
- North America > United States
  - New York > New York County > New York City (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Chatbot (1.00)
  - Machine Learning
    - Reinforcement Learning (0.86)
    - Statistical Learning (0.68)
