 Discourse & Dialogue


Multi-Domain Dialogue State Tracking -- A Purely Transformer-Based Generative Approach

arXiv.org Artificial Intelligence

We investigate the problem of multi-domain Dialogue State Tracking (DST) with open vocabulary. Existing approaches exploit a BERT encoder and a copy-based RNN decoder, where the encoder first predicts the state operation, and then the decoder generates new slot values. However, in this stacked encoder-decoder structure, the operation prediction objective only affects the BERT encoder and the value generation objective mainly affects the RNN decoder. In this paper, we propose a purely Transformer-based framework that uses BERT as both encoder and decoder. In so doing, the operation prediction objective and the value generation objective can jointly optimize our model for DST. At the decoding step, we re-use the hidden states of the encoder in the self-attention mechanism of the corresponding decoder layer to construct a flat model structure for effective parameter updating. Experimental results show that our approach substantially outperforms the existing state-of-the-art framework, and that its performance is highly competitive with the best ontology-based approaches.


Migratable AI: Personalizing Dialog Conversations with migration context

arXiv.org Artificial Intelligence

The migration of conversational AI agents across different embodiments in order to maintain the continuity of the task has recently been explored to further improve user experience. However, these migratable agents lack contextual understanding of the user information and the migrated device during dialog conversations with the user. This opens the question of how an agent might behave when migrated into an embodiment for contextually predicting the next utterance. We collected a dataset of dialog conversations between crowdsourced workers with the migration context, involving personal and non-personal utterances in different settings (public or private) of the embodiment into which the agent migrated. We trained generative and information retrieval models on the dataset with and without migration context and report the results of both qualitative metrics and human evaluation. We believe that the migration dataset would be useful for training future migratable AI systems.


Multi-Domain Dialogue State Tracking based on State Graph

arXiv.org Artificial Intelligence

We investigate the problem of multi-domain Dialogue State Tracking (DST) with open vocabulary, which aims to extract the state from the dialogue. Existing approaches usually concatenate previous dialogue state with dialogue history as the input to a bi-directional Transformer encoder. They rely on the self-attention mechanism of Transformer to connect tokens in them. However, attention may be paid to spurious connections, leading to wrong inference. In this paper, we propose to construct a dialogue state graph in which domains, slots and values from the previous dialogue state are connected properly. Through training, the graph node and edge embeddings can encode co-occurrence relations between domain-domain, slot-slot and domain-slot, reflecting the strong transition paths in general dialogue. The state graph, encoded with relational-GCN, is fused into the Transformer encoder. Experimental results show that our approach achieves a new state of the art on the task while remaining efficient. It outperforms existing open-vocabulary DST approaches.


A Graph Based and Patient Demographics Aware Dialogue System for Disease Diagnosis

arXiv.org Artificial Intelligence

A dialogue system for disease diagnosis aims at making a diagnosis by conversing with patients. Existing disease diagnosis dialogue systems rely heavily on data-driven methods and statistical features, lacking profound comprehension of medical knowledge, such as symptom-disease relations. In addition, previous work pays less attention to the demographic attributes of a patient, which are important factors in clinical diagnoses. To tackle these issues, this work presents a graph-based and demographic-attribute-aware dialogue system for disease diagnosis. Specifically, we first build a weighted bidirectional graph based on clinical dialogues to depict the relationship between symptoms and diseases and then present a bidirectional graph-based deep Q-network (BG-DQN) for dialogue management. By extending the Graph Convolutional Network (GCN) to learn the embeddings of diseases and symptoms from both the structural and attribute information in the graph, BG-DQN can better capture the relations between diseases and symptoms. Moreover, BG-DQN also encodes the demographic attributes of a patient to assist the disease diagnosis process. Experimental results show that the proposed dialogue system outperforms several competitive methods in terms of diagnostic accuracy. More importantly, our method can complete the task with fewer dialogue turns and is better at distinguishing diseases with similar symptoms.


Generating Strategic Dialogue for Negotiation with Theory of Mind

arXiv.org Artificial Intelligence

We propose a framework to integrate the concept of Theory of Mind (ToM) into generating utterances for task-oriented dialogue. Our approach explores the ability to model and infer personality types of opponents, predicts their responses, and uses this information to adapt the agent's high-level strategy in negotiation tasks. We introduce a probabilistic formulation for the first-order theory of mind and test our approach on the CraigslistBargain dataset. Experiments show that our method using ToM inference achieves a 40% higher dialogue agreement rate compared to baselines on a mixed population of opponents. We also show that our model displays diverse negotiation behavior with different types of opponents.


Example-Driven Intent Prediction with Observers

arXiv.org Artificial Intelligence

A key challenge of dialog systems research is to effectively and efficiently adapt to new domains. A scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few-shot settings. In this paper, we focus on the intent classification problem which aims to identify user intents given utterances addressed to the dialog system. We propose two approaches for improving the generalizability of utterance classification models: (1) example-driven training and (2) observers. Example-driven training learns to classify utterances by comparing to examples, thereby using the underlying encoder as a sentence similarity model. Prior work has shown that BERT-like models tend to attribute a significant amount of attention to the [CLS] token, which we hypothesize results in diluted representations. Observers are tokens that are not attended to, and are an alternative to the [CLS] token. The proposed methods attain state-of-the-art results on three intent prediction datasets (Banking, Clinc, and HWU) in both the full data and few-shot (10 examples per intent) settings. Furthermore, we demonstrate that the proposed approach can transfer to new intents and across datasets without any additional training.
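
The idea of classifying an utterance by comparing it to labeled examples can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `encode` function stands in for a BERT-like encoder, and `predict_intent`, the `temperature` parameter, and the example utterances are all hypothetical names and values chosen for illustration. The sketch takes a softmax over similarities to the examples and sums the resulting weights per intent.

```python
import math
from collections import Counter, defaultdict


def encode(utterance):
    """Stand-in encoder (assumption): bag-of-words counts instead of BERT."""
    return Counter(utterance.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict_intent(utterance, examples, temperature=0.1):
    """Softmax over similarities to labeled examples, summed per intent."""
    query = encode(utterance)
    sims = [cosine(query, encode(text)) / temperature for text, _ in examples]
    max_sim = max(sims)  # subtract the max for numerical stability
    weights = [math.exp(s - max_sim) for s in sims]
    z = sum(weights)
    probs = defaultdict(float)
    for w, (_, intent) in zip(weights, examples):
        probs[intent] += w / z
    return max(probs, key=probs.get), dict(probs)


# Hypothetical labeled examples (two per intent).
EXAMPLES = [
    ("what is my account balance", "check_balance"),
    ("how much money do i have", "check_balance"),
    ("i lost my card", "lost_card"),
    ("my card was stolen", "lost_card"),
]

intent, probs = predict_intent("i think my card is lost", EXAMPLES)
print(intent, probs)
```

Because the prediction depends only on the example set, new intents can be supported in this sketch by adding labeled examples, without changing the scoring function.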


Neural Topic Model via Optimal Transport

arXiv.org Machine Learning

Recently, Neural Topic Models (NTMs) inspired by variational autoencoders have attracted increasing research interest due to their promising results on text analysis. However, it is usually hard for existing NTMs to achieve good document representation and coherent/diverse topics at the same time. Moreover, their performance often degrades severely on short documents. The requirement of reparameterisation could also compromise their training quality and model flexibility. To address these shortcomings, we present a new neural topic model via the theory of optimal transport (OT). Specifically, we propose to learn the topic distribution of a document by directly minimising its OT distance to the document's word distributions. Importantly, the cost matrix of the OT distance models the weights between topics and words, and is constructed from the distances between topics and words in an embedding space. Our proposed model can be trained efficiently with a differentiable loss. Extensive experiments show that our framework significantly outperforms the state-of-the-art NTMs on discovering more coherent and diverse topics and deriving better document representations for both regular and short texts.


NUIG-Shubhanker@Dravidian-CodeMix-FIRE2020: Sentiment Analysis of Code-Mixed Dravidian text using XLNet

arXiv.org Artificial Intelligence

Social media has penetrated multilingual societies; however, most of their users prefer English as the language of communication. It is therefore natural for them to mix their cultural language with English during conversations, resulting in an abundance of multilingual data, called code-mixed data, in today's world. Downstream NLP tasks using such data are challenging because the semantics are spread across multiple languages. One such Natural Language Processing task is sentiment analysis; for this, we use an auto-regressive XLNet model to perform sentiment analysis on code-mixed Tamil-English and Malayalam-English datasets.


Context-Guided BERT for Targeted Aspect-Based Sentiment Analysis

arXiv.org Artificial Intelligence

Aspect-based sentiment analysis (ABSA) and Targeted ABSA (TABSA) allow finer-grained inferences about sentiment to be drawn from the same text, depending on context. For example, a given text can have different targets (e.g., neighborhoods) and different aspects (e.g., price or safety), with different sentiment associated with each target-aspect pair. In this paper, we investigate whether adding context to self-attention models improves performance on (T)ABSA. We propose two variants of Context-Guided BERT (CG-BERT) that learn to distribute attention under different contexts. We first adapt a context-aware Transformer to produce a CG-BERT that uses context-guided softmax-attention. Next, we propose an improved Quasi-Attention CG-BERT model that learns a compositional attention that supports subtractive attention. We train both models with pretrained BERT on two (T)ABSA datasets: SentiHood and SemEval-2014 (Task 4). Both models achieve new state-of-the-art results, with our QACG-BERT model having the best performance. Furthermore, we provide analyses of the impact of context in our proposed models. Our work provides more evidence for the utility of adding context-dependencies to pretrained self-attention-based language models for context-based natural language tasks.


How to Run Sentiment Analysis in Python using VADER

#artificialintelligence

We have explained how to get a sentiment score for words in Python. Instead of building our own lexicon, we can use a pre-built one such as VADER, which stands for Valence Aware Dictionary and sEntiment Reasoner and is specifically attuned to sentiments expressed in social media. You can install the VADER library using pip (pip install vaderSentiment), or you can get it directly from NLTK. You can have a look at the VADER documentation. In VADER's output, notice that the pos, neu and neg scores add up to 1. Also, the compound score is a very useful metric in case we want a single measure of sentiment.
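
To show why the pos, neu and neg scores add up to 1 and how a single compound score can be derived, here is a simplified lexicon-based scorer in the spirit of VADER. It is only a sketch: the lexicon values and the `toy_polarity_scores` function are hypothetical, and the real library applies additional heuristics (negation, intensifiers, punctuation, capitalization) that are omitted here. With the real package, the corresponding call is `SentimentIntensityAnalyzer().polarity_scores(text)` from `vaderSentiment.vaderSentiment`, which returns a dictionary with the same four keys.

```python
import math

# Toy valence lexicon: hypothetical values for illustration only,
# not entries copied from VADER's real lexicon.
TOY_LEXICON = {"good": 1.9, "great": 3.1, "bad": -2.5, "terrible": -2.1}

ALPHA = 15  # normalization constant VADER uses for the compound score


def toy_polarity_scores(text):
    """Return neg/neu/pos shares and a compound score for `text`."""
    tokens = text.lower().split()
    valences = [TOY_LEXICON.get(tok.strip(".,!?"), 0.0) for tok in tokens]

    # Each sentiment-bearing token contributes its valence magnitude plus one
    # unit, each neutral token contributes one unit; dividing by the total
    # makes the three shares sum to 1.
    pos_sum = sum(v + 1 for v in valences if v > 0)
    neg_sum = sum(-v + 1 for v in valences if v < 0)
    neu_count = sum(1 for v in valences if v == 0)
    total = pos_sum + neg_sum + neu_count

    if total == 0:  # empty input
        return {"neg": 0.0, "neu": 0.0, "pos": 0.0, "compound": 0.0}

    raw = sum(valences)
    compound = raw / math.sqrt(raw * raw + ALPHA)  # squashed into (-1, 1)
    return {
        "neg": round(neg_sum / total, 3),
        "neu": round(neu_count / total, 3),
        "pos": round(pos_sum / total, 3),
        "compound": round(compound, 4),
    }


scores = toy_polarity_scores("The food was great but the service was terrible!")
print(scores)
```

When a single label is needed, the VADER documentation suggests treating a compound score of 0.05 or higher as positive, -0.05 or lower as negative, and anything in between as neutral.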