Dialog Action-Aware Transformer for Dialog Policy Learning

Wang, Huimin, Kwan, Wai-Chung, Wong, Kam-Fai

Sep-5-2023–arXiv.org Artificial Intelligence

Recent works usually address Dialog policy learning DPL by training a reinforcement learning (RL) agent to determine the best dialog action. However, existing works on deep RL require a large volume of agent-user interactions to achieve acceptable performance. In this paper, we propose to make full use of the plain text knowledge from the pre-trained language model to accelerate the RL agent's learning speed. Specifically, we design a dialog action-aware transformer encoder (DaTrans), which integrates a new fine-tuning procedure named masked last action task to encourage DaTrans to be dialog-aware and distils action-specific features. Then, DaTrans is further optimized in an RL setting with ongoing interactions and evolves through exploration in the dialog action space toward maximizing long-term accumulated rewards. The effectiveness and efficiency of the proposed model are demonstrated with both simulator evaluation and human evaluation.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

Sep-5-2023

arXiv.org PDF

Add feedback

Country:
- Asia > China (0.29)
- Europe (0.93)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.14)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Education (0.69)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found