Serban, Iulian V., Sankar, Chinnadhurai, Germain, Mathieu, Zhang, Saizheng, Lin, Zhouhan, Subramanian, Sandeep, Kim, Taesup, Pieper, Michael, Chandar, Sarath, Ke, Nan Rosemary, Rajeswar, Sai, de Brebisson, Alexandre, Sotelo, Jose M. R., Suhubdy, Dendi, Michalski, Vincent, Nguyen, Alexandre, Pineau, Joelle, Bengio, Yoshua
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including neural network and template-based models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than other systems. The results highlight the potential of coupling ensemble systems with deep reinforcement learning as a fruitful path for developing real-world, open-domain conversational agents.
Eighteen teams were selected for the inaugural competition last year. To build their socialbots, the students combined state-of-the-art techniques with their own novel strategies in the areas of natural language understanding and conversational AI. This article reports on the research conducted over the 2017-2018 year. While the 20-minute grand challenge was not achieved in the first year, the competition produced several conversational agents that advanced the state of the art, that are interesting for everyday users to interact with, and that help form a baseline for the second year of the competition. We conclude with a summary of the human conversation have applicability in both work that we plan to address in the second year of professional and everyday domains. The first generation of such assistants -- Amazon's Alexa, Apple's Siri, Google The Alexa Prize competition received hundreds of Assistant, and Microsoft's Cortana -- have been applications from interested universities. After a focused on short, task-oriented interactions, such as detailed review of the applications, Amazon playing music or answering simple questions, as announced 12 sponsored and 6 unsponsored teams opposed to the longer free-form conversations that as the inaugural cohort for the Alexa Prize. The teams occur naturally in social and professional human that went live for the 2017 competition, listed alphabetically interaction. Conversational AI is the study of techniques by university, were DeisBot (Brandeis University), for creating software agents that can engage Magnus (Carnegie Mellon University), in natural conversational interactions with humans.
Neural network-based Open-ended conversational agents automatically generate responses based on predictive models learned from a large number of pairs of utterances. The generated responses are typically acceptable as a sentence but are often dull, generic, and certainly devoid of any emotion. In this paper, we present neural models that learn to express a given emotion in the generated response. We propose four models and evaluate them against 3 baselines. An encoder-decoder framework-based model with multiple attention layers provides the best overall performance in terms of expressing the required emotion. While it does not outperform other models on all emotions, it presents promising results in most cases.
This paper proposes a deep neural network model for joint modeling Natural Language Understanding (NLU) and Dialogue Management (DM) in goal-driven dialogue systems. There are three parts in this model. A Long Short-Term Memory (LSTM) at the bottom of the network encodes utterances in each dialogue turn into a turn embedding. Dialogue embeddings are learned by a LSTM at the middle of the network, and updated by the feeding of all turn embeddings. The top part is a forward Deep Neural Network which converts dialogue embeddings into the Q-values of different dialogue actions. The cascaded LSTMs based reinforcement learning network is jointly optimized by making use of the rewards received at each dialogue turn as the only supervision information. There is no explicit NLU and dialogue states in the network. Experimental results show that our model outperforms both traditional Markov Decision Process (MDP) model and single LSTM with Deep Q-Network on meeting room booking tasks. Visualization of dialogue embeddings illustrates that the model can learn the representation of dialogue states.
We propose an adversarial learning approach to the generation of multi-turn dialogue responses. Our proposed framework, hredGAN, is based on conditional generative adversarial networks (GANs). The GAN's generator is a modified hierarchical recurrent encoder-decoder network (HRED) and the discriminator is a word-level bidirectional RNN that shares context and word embedding with the generator. During inference, noise samples conditioned on the dialogue history are used to perturb the generator's latent space to generate several possible responses. The final response is the one ranked best by the discriminator. The hredGAN shows major advantages over existing methods: (1) it generalizes better than networks trained using only the log-likelihood criterion, and (2) it generates longer, more informative and more diverse responses with high utterance and topic relevance even with limited training data. This superiority is demonstrated on the Movie triples and Ubuntu dialogue datasets in terms of perplexity, BLEU, ROUGE and Distinct n-gram scores.