Learning to Dialogue via Complex Hindsight Experience Replay

Lu, Keting, Zhang, Shiqi, Chen, Xiaoping

Aug-20-2018–arXiv.org Artificial Intelligence

Reinforcement learning methods have been used to learn dialogue policies from conversational experience. However, learning an effective dialogue policy frequently requires prohibitively many conversations. This is partly because rewards in dialogues are sparse and successful dialogues are relatively rare in the early learning phase. Hindsight experience replay (HER) enables an agent to learn from failures, but vanilla HER is inapplicable to dialogue domains because dialogue goals are implicit (cf. the explicit goals in manipulation tasks). In this work, we develop two complex HER methods that provide different trade-offs between complexity and performance. Experiments were conducted using a realistic user simulator. Results suggest that our HER methods learn faster than standard and prioritized experience replay methods (as applied to deep Q-networks), and that our two complex HER methods can be combined to produce the best performance.

machine learning, natural language, reinforcement learning

Country:
- Asia > China (0.04)
- North America
  - United States
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.04)
    - New York > Broome County
      - Binghamton (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)

Genre:
- Research Report > New Finding (0.34)

Industry:
- Leisure & Entertainment (0.68)
- Media > Film (0.46)
- Education > Educational Setting (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Discourse & Dialogue (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.68)
