We present ConvLab-2, an open-source toolkit that enables researchers to build task-oriented dialogue systems with state-of-the-art models, perform an end-to-end evaluation, and diagnose the weakness of systems. As the successor of ConvLab (Lee et al., 2019b), ConvLab-2 inherits ConvLab's framework but integrates more powerful dialogue models and supports more datasets. Besides, we have developed an analysis tool and an interactive tool to assist researchers in diagnosing dialogue systems. The analysis tool presents rich statistics and summarizes common mistakes from simulated dialogues, which facilitates error analysis and system improvement. The interactive tool provides a user interface that allows developers to diagnose an assembled dialogue system by interacting with the system and modifying the output of each system component.
We describe a corpus-based approach of natural language dialogue system. The characteristic is that all the system's behaviors, like processing and understanding dialogues and generating responses, depend on corpora. As a result, the system can handle any language and any topic. This paper aims to explain the whole architecture and individual technology used in our project.
There is an increasing demand for task-oriented dialogue systems which can assist users in various activities such as booking tickets and restaurant reservations. In order to complete dialogues effectively, dialogue policy plays a key role in task-oriented dialogue systems. As far as we know, the existing task-oriented dialogue systems obtain the dialogue policy through classification, which can assign either a dialogue act and its corresponding parameters or multiple dialogue acts without their corresponding parameters for a dialogue action. In fact, a good dialogue policy should construct multiple dialogue acts and their corresponding parameters at the same time. However, it's hard for existing classification-based methods to achieve this goal. Thus, to address the issue above, we propose a novel generative dialogue policy learning method. Specifically, the proposed method uses attention mechanism to find relevant segments of given dialogue context and input utterance and then constructs the dialogue policy by a seq2seq way for task-oriented dialogue systems. Extensive experiments on two benchmark datasets show that the proposed model significantly outperforms the state-of-the-art baselines. In addition, we have publicly released our codes.
Human conversations in real scenarios are complicated and building a human-like dialogue agent is an extremely challenging task. With the rapid development of deep learning techniques, data-driven models become more and more prevalent which need a huge amount of real conversation data. In this paper, we construct a large-scale real scenario Chinese E-commerce conversation corpus, JDDC, with more than 1 million multi-turn dialogues, 20 million utterances, and 150 million words. The dataset reflects several characteristics of human-human conversations, e.g., goal-driven, and long-term dependency among the context. It also covers various dialogue types including task-oriented, chitchat and question-answering. Extra intent information and three well-annotated challenge sets are also provided. Then, we evaluate several retrieval-based and generative models to provide basic benchmark performance on JDDC corpus. And we hope JDDC can serve as an effective testbed and benefit the development of fundamental research in dialogue task.
This paper describes a novel method by which a spoken dialogue system can learn to choose an optimal dialogue strategy from its experience interacting with human users. The method is based on a combination of reinforcement learning and performance modeling of spoken dialogue systems. The reinforcement learning component applies Q-learning (Watkins, 1989), while the performance modeling component applies the PARADISE evaluation framework (Walker et al., 1997) to learn the performance function (reward) used in reinforcement learning. We illustrate the method with a spoken dialogue system named ELVIS (EmaiL Voice Interactive System), that supports access to email over the phone. We conduct a set of experiments for training an optimal dialogue strategy on a corpus of 219 dialogues in which human users interact with ELVIS over the phone. We then test that strategy on a corpus of 18 dialogues. We show that ELVIS can learn to optimize its strategy selection for agent initiative, for reading messages, and for summarizing email folders.