Serban, Iulian V., Sankar, Chinnadhurai, Germain, Mathieu, Zhang, Saizheng, Lin, Zhouhan, Subramanian, Sandeep, Kim, Taesup, Pieper, Michael, Chandar, Sarath, Ke, Nan Rosemary, Rajeshwar, Sai, de Brebisson, Alexandre, Sotelo, Jose M. R., Suhubdy, Dendi, Michalski, Vincent, Nguyen, Alexandre, Pineau, Joelle, Bengio, Yoshua
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.
Recently, neural network based dialogue systems have become ubiquitous in our increasingly digitalized society. However, due to their inherent opaqueness, some recently raised concerns about using neural models are starting to be taken seriously. In fact, intentional or unintentional behaviors could lead to a dialogue system to generate inappropriate responses. Thus, in this paper, we investigate whether we can learn to craft input sentences that result in a black-box neural dialogue model being manipulated into having its outputs contain target words or match target sentences. We propose a reinforcement learning based model that can generate such desired inputs automatically. Extensive experiments on a popular well-trained state-of-the-art neural dialogue model show that our method can successfully seek out desired inputs that lead to the target outputs in a considerable portion of cases. Consequently, our work reveals the potential of neural dialogue models to be manipulated, which inspires and opens the door towards developing strategies to defend them.
We introduce end-to-end neural network based models for simulating users of task-oriented dialogue systems. User simulation in dialogue systems is crucial from two different perspectives: (i) automatic evaluation of different dialogue models, and (ii) training task-oriented dialogue systems. We design a hierarchical sequence-to-sequence model that first encodes the initial user goal and system turns into fixed length representations using Recurrent Neural Networks (RNN). It then encodes the dialogue history using another RNN layer. At each turn, user responses are decoded from the hidden representations of the dialogue level RNN. This hierarchical user simulator (HUS) approach allows the model to capture undiscovered parts of the user goal without the need of an explicit dialogue state tracking. We further develop several variants by utilizing a latent variable model to inject random variations into user responses to promote diversity in simulated user responses and a novel goal regularization mechanism to penalize divergence of user responses from the initial user goal. We evaluate the proposed models on movie ticket booking domain by systematically interacting each user simulator with various dialogue system policies trained with different objectives and users.
We're going to be blunt: Chatbots in their current form aren't great. We were promised bots that would change the way we interact with businesses and services, but instead we have interactive bots that perform worse than apps. They are primarily focused on taps or interactive graphical interfaces, and conversing with them using natural language is nearly impossible. Take an example of Poncho Weather on Facebook Messenger. Let's say I'm going to a conference next Monday in San Diego and want to know what the forecast is.
Communication is a cooperative effort that requires reaching mutual understanding among the participants. Humans use commonsense reasoning implicitly to produce natural and logically-coherent responses. As a step towards fluid human-AI communication, we study if response generation (RG) models can emulate human reasoning process and use common sense to help produce better-quality responses. We aim to tackle two research questions: how to formalize conversational common sense and how to examine RG models capability to use common sense? We first propose a task, CEDAR: Causal common sEnse in DiAlogue Response generation, that concretizes common sense as textual explanations for what might lead to the response and evaluates RG models behavior by comparing the modeling loss given a valid explanation with an invalid one. Then we introduce a process that automatically generates such explanations and ask humans to verify them. Finally, we design two probing settings for RG models targeting two reasoning capabilities using verified explanations. We find that RG models have a hard time determining the logical validity of explanations but can identify grammatical naturalness of the explanation easily.