Discourse & Dialogue
A Bandit Approach to Posterior Dialog Orchestration Under a Budget
Upadhyay, Sohini, Agarwal, Mayank, Bounneffouf, Djallel, Khazaeni, Yasaman
Building multi-domain AI agents is a challenging task and an open problem in the area of AI. Within the domain of dialog, the ability to orchestrate multiple independently trained dialog agents, or skills, to create a unified system is of particular significance. In this work, we study the task of online posterior dialog orchestration, where we define posterior orchestration as the task of selecting a subset of skills which most appropriately answer a user input using features extracted from both the user input and the individual skills. To account for the various costs associated with extracting skill features, we consider online posterior orchestration under a skill execution budget. We formalize this setting as Context Attentive Bandit with Observations (CABO), a variant of context attentive bandits, and evaluate it on simulated non-conversational and proprietary conversational datasets.
Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems
Ghandeharioun, Asma, Shen, Judy Hanwen, Jaques, Natasha, Ferguson, Craig, Jones, Noah, Lapedriza, Agata, Picard, Rosalind
Building an open-domain conversational agent is a challenging problem. Current evaluation methods, mostly post-hoc judgments of single-turn evaluation, do not capture conversation quality in a realistic interactive context. In this paper, we investigate interactive human evaluation and provide evidence for its necessity; we then introduce a novel, model-agnostic, and dataset-agnostic method to approximate it. In particular, we propose a self-play scenario where the dialog system talks to itself and we calculate a combination of proxies such as sentiment and semantic coherence on the conversation trajectory. We show that this metric is capable of capturing the human-rated quality of a dialog model better than any automated metric known to-date, achieving a significant Pearson correlation (r>.7, p<.05). To investigate the strengths of this novel metric and interactive evaluation in comparison to state-of-the-art metrics and one-turn evaluation, we perform extended experiments with a set of models, including several that make novel improvements to recent hierarchical dialog generation architectures through sentiment and semantic knowledge distillation on the utterance level. Finally, we open-source the interactive evaluation platform we built and the dataset we collected to allow researchers to efficiently deploy and evaluate generative dialog models.
Interactive Topic Modeling with Anchor Words
Dasgupta, Sanjoy, Poulis, Stefanos, Tosh, Christopher
The formalism of anchor words has enabled the development of fast topic modeling algorithms with provable guarantees. In this paper, we introduce a protocol that allows users to interact with anchor words to build customized and interpretable topic models. Experimental evidence validating the usefulness of our approach is also presented.
Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good
Wang, Xuewei, Shi, Weiyan, Kim, Richard, Oh, Yoojung, Yang, Sijia, Zhang, Jingwen, Yu, Zhou
Developing intelligent persuasive conversational agents to change people's opinions and actions for social good is the frontier in advancing the ethical development of automated dialogue systems. To do so, the first step is to understand the intricate organization of strategic disclosures and appeals employed in human persuasion conversations. We designed an online persuasion task where one participant was asked to persuade the other to donate to a specific charity. We collected a large dataset with 1,017 dialogues and annotated emerging persuasion strategies from a subset. Based on the annotation, we built a baseline classifier with context information and sentence-level features to predict the 10 persuasion strategies used in the corpus. Furthermore, to develop an understanding of personalized persuasion processes, we analyzed the relationships between individuals' demographic and psychological backgrounds including personality, morality, value systems, and their willingness for donation. Then, we analyzed which types of persuasion strategies led to a greater amount of donation depending on the individuals' personal backgrounds. This work lays the ground for developing a personalized persuasive dialogue system.
Unsupervised Neural Single-Document Summarization of Reviews via Learning Latent Discourse Structure and its Ranking
Isonuma, Masaru, Mori, Junichiro, Sakata, Ichiro
This paper focuses on the end-to-end abstractive summarization of a single product review without supervision. We assume that a review can be described as a discourse tree, in which the summary is the root, and the child sentences explain their parent in detail. By recursively estimating a parent from its children, our model learns the latent discourse tree without an external parser and generates a concise summary. We also introduce an architecture that ranks the importance of each sentence on the tree to support summary generation focusing on the main review point. The experimental results demonstrate that our model is competitive with or outperforms other unsupervised approaches. In particular, for relatively long reviews, it achieves a competitive or better performance than supervised models. The induced tree shows that the child sentences provide additional information about their parent, and the generated summary abstracts the entire review.
E3: Entailment-driven Extracting and Editing for Conversational Machine Reading
Zhong, Victor, Zettlemoyer, Luke
Conversational machine reading systems help users answer high-level questions (e.g. determine if they qualify for particular government benefits) when they do not know the exact rules by which the determination is made(e.g. whether they need certain income levels or veteran status). The key challenge is that these rules are only provided in the form of a procedural text (e.g. guidelines from government website) which the system must read to figure out what to ask the user. We present a new conversational machine reading model that jointly extracts a set of decision rules from the procedural text while reasoning about which are entailed by the conversational history and which still need to be edited to create questions for the user. On the recently introduced ShARC conversational machine reading dataset, our Entailment-driven Extract and Edit network (E3) achieves a new state-of-the-art, outperforming existing systems as well as a new BERT-based baseline. In addition, by explicitly highlighting which information still needs to be gathered, E3 provides a more explainable alternative to prior work. We release source code for our models and experiments at https://github.com/vzhong/e3.
Using Structured Representation and Data: A Hybrid Model for Negation and Sentiment in Customer Service Conversations
Misra, Amita, Bhuiyan, Mansurul, Mahmud, Jalal, Tripathy, Saurabh
Twitter customer service interactions have recently emerged as an effective platform to respond and engage with customers. In this work, we explore the role of negation in customer service interactions, particularly applied to sentiment analysis. We define rules to identify true negation cues and scope more suited to conversational data than existing general review data. Using semantic knowledge and syntactic structure from constituency parse trees, we propose an algorithm for scope detection that performs comparable to state of the art BiLSTM. We further investigate the results of negation scope detection for the sentiment prediction task on customer service conversation data using both a traditional SVM and a Neural Network. We propose an antonym dictionary based method for negation applied to a CNN-LSTM combination model for sentiment analysis. Experimental results show that the antonym-based method outperforms the previous lexicon-based and neural network methods.
Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention
Chen, Wenhu, Chen, Jianshu, Qin, Pengda, Yan, Xifeng, Wang, William Yang
Semantically controlled neural response generation on limited-domain has achieved great performance. However, moving towards multi-domain large-scale scenarios are shown to be difficult because the possible combinations of semantic inputs grow exponentially with the number of domains. To alleviate such scalability issue, we exploit the structure of dialog acts to build a multi-layer hierarchical graph, where each act is represented as a root-to-leaf route on the graph. Then, we incorporate such graph structure prior as an inductive bias to build a hierarchical disentangled self-attention network, where we disentangle attention heads to model designated nodes on the dialog act graph. By activating different (disentangled) heads at each layer, combinatorially many dialog act semantics can be modeled to control the neural response generation. On the large-scale Multi-Domain-WOZ dataset, our model can yield a significant improvement over the baselines on various automatic and human evaluation metrics.
Forward and Backward Knowledge Transfer for Sentiment Classification
Wang, Hao, Liu, Bing, Wang, Shuai, Ma, Nianzu, Yang, Yan
This paper studies the problem of learning a sequence of sentiment classification tasks. The learned knowledge from each task is retained and used to help future or subsequent task learning. This learning paradigm is called Lifelong Learning (LL). However, existing LL methods either only transfer knowledge forward to help future learning and do not go back to improve the model of a previous task or require the training data of the previous task to retrain its model to exploit backward/reverse knowledge transfer. This paper studies reverse knowledge transfer of LL in the context of naive Bayesian (NB) classification. It aims to improve the model of a previous task by leveraging future knowledge without retraining using its training data. This is done by exploiting a key characteristic of the generative model of NB. That is, it is possible to improve the NB classifier for a task by improving its model parameters directly by using the retained knowledge from other tasks. Experimental results show that the proposed method markedly outperforms existing LL baselines.
Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models
Terenin, Alexander, Magnusson, Måns, Jonsson, Leif
Nonparametric extensions of topic models such as Latent Dirichlet Allocation, including Hierarchical Dirichlet Process (HDP), are often studied in natural language processing. Training these models generally requires use of serial algorithms, which limits scalability to large data sets and complicates acceleration via use of parallel and distributed systems. Most current approaches to scalable training of such models either don't converge to the correct target, or are not data-parallel. Moreover, these approaches generally do not utilize all available sources of sparsity found in natural language - an important way to make computation efficient. Based upon a representation of certain conditional distributions within an HDP, we propose a doubly sparse data-parallel sampler for the HDP topic model that addresses these issues.