Goto

Collaborating Authors

 Discourse & Dialogue


Getting Started with Natural Language Processing: US Airline Sentiment Analysis

#artificialintelligence

Natural Language Processing (NLP) is a subfield of machine learning concerned with processing and analyzing natural language data, usually in the form of text or audio. Some common challenges within NLP include speech recognition, text generation, and sentiment analysis, while some high-profile products deploying NLP models include Apple's Siri, Amazon's Alexa, and many of the chatbots one might interact with online. To get started with NLP and introduce some of the core concepts in the field, we're going to build a model that tries to predict the sentiment (positive, neutral, or negative) of tweets relating to US Airlines, using the popular Twitter US Airline Sentiment dataset. Code snippets will be included in this post, but for fully reproducible notebooks and scripts, view all of the notebooks and scripts associated with this project on its Comet project page. Let's start by importing some libraries.


Building Task-Oriented Visual Dialog Systems Through Alternative Optimization Between Dialog Policy and Language Generation

arXiv.org Artificial Intelligence

Reinforcement learning (RL) is an effective approach to learn an optimal dialog policy for task-oriented visual dialog systems. A common practice is to apply RL on a neural sequence-to-sequence (seq2seq) framework with the action space being the output vocabulary in the decoder. However, it is difficult to design a reward function that can achieve a balance between learning an effective policy and generating a natural dialog response. This paper proposes a novel framework that alternatively trains a RL policy for image guessing and a supervised seq2seq model to improve dialog generation quality. We evaluate our framework on the GuessWhich task and the framework achieves the state-of-the-art performance in both task completion and dialog quality.


TransSent: Towards Generation of Structured Sentences with Discourse Marker

arXiv.org Artificial Intelligence

This paper focuses on the task of generating long structured sentences with explicit discourse markers, by proposing a new task Sentence Transfer and a novel model architecture TransSent. Previous works on text generation fused semantic and structure information in one mixed hidden representation. However, the structure was difficult to maintain properly when the generated sentence became longer. In this work, we explicitly separate the modeling process of semantic information and structure information. Intuitively, humans produce long sentences by directly connecting discourses with discourse markers like and, but, etc. We thus define a new task called Sentence Transfer. This task represents a long sentence as (head discourse, discourse marker, tail discourse) and aims at tail discourse generation based on head discourse and discourse marker. Then, by connecting original head discourse and generated tail discourse with a discourse marker, we generate a long structured sentence. We also propose a model architecture called TransSent, which models relations between two discourses by interpreting them as transferring from one discourse to the other in the embedding space. Experiment results show that our model achieves better performance in automatic evaluations, and can generate structured sentences with high quality. The datasets can be accessed by https://github.com/1024er/TransSent dataset.


An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction

arXiv.org Artificial Intelligence

Task-oriented dialog systems need to know when a query falls outside their range of supported intents, but current text classification corpora only define label sets that cover every example. We introduce a new dataset that includes queries that are out-of-scope-- i.e., queries that do not fall into any of the system's supported intents. This poses a new challenge because models cannot assume that every query at inference time belongs to a system-supported intent class. Our dataset also covers 150 intent classes over 10 domains, capturing the breadth that a production task-oriented agent must handle. We evaluate a range of benchmark classifiers on our dataset along with several different out-of-scope identification schemes. We find that while the classifiers perform well on in-scope intent classification, they struggle to identify out-of-scope queries. Our dataset and evaluation fill an important gap in the field, offering a way of more rigorously and realistically benchmarking text classification in task-driven dialog systems.


How to Build User Simulators to Train RL-based Dialog Systems

arXiv.org Artificial Intelligence

User simulators are essential for training reinforcement learning (RL) based dialog models. However, building a good user simulator that models real user behaviors is challenging. We propose a method of standardizing user simulator building that can be used by the community to compare dialog system quality using the same set of user simulators fairly. We present implementations of six user simulators trained with different dialog planning and generation methods. We then calculate a set of automatic metrics to evaluate the quality of these simulators both directly and indirectly. We also ask human users to assess the simulators directly and indirectly by rating the simulated dialogs and interacting with the trained systems. This paper presents a comprehensive evaluation framework for user simulator study and provides a better understanding of the pros and cons of different user simulators, as well as their impacts on the trained systems. 1 1 Introduction Reinforcement Learning has gained more and more attention in dialog system training because it treats the dialog planning as a sequential decision problem and focuses on long-term rewards (Su et al., 2017). However, RL requires interaction with the environment, and obtaining real human users to interact with the system is both time-consuming and labor-intensive. Therefore, building user simulators to interact with the system before deployment to real users becomes an economical choice (Williams et al., 2017; Li et al., 2016). But the performance of the user simulator has a direct impact on the trained RL policy.* Equal contribution. 1 The code and data are released at https://github.


Scalable and Accurate Dialogue State Tracking via Hierarchical Sequence Generation

arXiv.org Artificial Intelligence

For each dialogue turn, a DST module takes a user utterance and the dialogue history as input, and outputs a belief estimate of the dialogue state. Then a machine action is decided based on the dialogue state according to a dialogue policy module, after which a machine response is generated. Traditionally, a dialogue state consists of a set of requests and joint goals, both of which are represented by a set of slot-value pairs (e.g. ( request, phone), ( area, north), ( food, Japanese)) (Henderson et al., 2014). In a recently proposed multi-DST Models ITC NBT -CNN (Mrksic et al., 2017) O (mn) MD-DST (Rastogi et al., 2017) O (n) GLAD (Zhong et al., 2018) O (mn) StateNet PSI (Ren et al., 2018) O (n) TRADE (Wu et al., 2019) O (n) HyST (Goel et al., 2019) O (n) DSTRead (Gao et al., 2019) O (n) Table 1: The Inference Time Complexity (ITC) of previous DST models. The ITC is calculated based on how many times inference must be performed to complete a prediction of the belief state in a dialogue turn, where m is the number of values in a predefined ontology list and n is the number of slots.


Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training

arXiv.org Machine Learning

User-generated reviews can be decomposed into fine-grained segments (e.g., sentences, clauses), each evaluating a different aspect of the principal entity (e.g., price, quality, appearance). Automatically detecting these aspects can be useful for both users and downstream opinion mining applications. Current supervised approaches for learning aspect classifiers require many fine-grained aspect labels, which are labor-intensive to obtain. And, unfortunately, unsupervised topic models often fail to capture the aspects of interest. In this work, we consider weakly supervised approaches for training aspect classifiers that only require the user to provide a small set of seed words (i.e., weakly positive indicators) for the aspects of interest. First, we show that current weakly supervised approaches do not effectively leverage the predictive power of seed words for aspect detection. Next, we propose a student-teacher approach that effectively leverages seed words in a bag-of-words classifier (teacher); in turn, we use the teacher to train a second model (student) that is potentially more powerful (e.g., a neural network that uses pre-trained word embeddings). Finally, we show that iterative co-training can be used to cope with noisy seed words, leading to both improved teacher and student models. Our proposed approach consistently outperforms previous weakly supervised approaches (by 14.1 absolute F1 points on average) in six different domains of product reviews and six multilingual datasets of restaurant reviews.


A Unified Neural Coherence Model

arXiv.org Machine Learning

Recently, neural approaches to coherence modeling have achieved state-of-the-art results in several evaluation tasks. However, we show that most of these models often fail on harder tasks with more realistic application scenarios. In particular, the existing models underperform on tasks that require the model to be sensitive to local contexts such as candidate ranking in conversational dialogue and in machine translation. In this paper, we propose a unified coherence model that incorporates sentence grammar, inter-sentence coherence relations, and global coherence patterns into a common neural framework. With extensive experiments on local and global discrimination tasks, we demonstrate that our proposed model outperforms existing models by a good margin, and establish a new state-of-the-art. 1 Introduction Coherence modeling involves building text analysis models that can distinguish a coherent text from incoherent ones. It has been a key problem in discourse analysis with applications in text generation, summarization, and coherence scoring. V arious linguistic theories have been proposed to formulate coherence, some of which have inspired development of many of the existing coherence models. These include the entity-based local models (Barzilay and Lapata, 2008; Elsner and Charniak, 2011b) that consider syntactic realization of entities in adjacent sentences, inspired by the Centering Theory (Grosz et al., 1995). Another line of research uses discourse relations between sentences to predict local coherence (Pitler and Nenkova, 2008; Lin et al., 2011). These methods are inspired by the discourse structure theories like Rhetorical Structure Theory (RST) (Mann and Thompson, 1988) that formalizes coherence in *Equal contribution terms of discourse relations.


Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset

arXiv.org Artificial Intelligence

A significant barrier to progress in data-driven approaches to building dialog systems is the lack of high quality, goal-oriented conversational data. To help satisfy this elementary requirement, we introduce the initial release of the Taskmaster-1 dataset which includes 13,215 task-based dialogs comprising six domains. Two procedures were used to create this collection, each with unique advantages. The first involves a two-person, spoken "Wizard of Oz" (WOz) approach in which trained agents and crowdsourced workers interact to complete the task while the second is "self-dialog" in which crowdsourced workers write the entire dialog themselves. We do not restrict the workers to detailed scripts or to a small knowledge base and hence we observe that our dataset contains more realistic and diverse conversations in comparison to existing datasets. We offer several baseline models including state of the art neural seq2seq architectures with benchmark performance as well as qualitative human evaluations. Dialogs are labeled with API calls and arguments, a simple and cost effective approach which avoids the requirement of complex annotation schema. The layer of abstraction between the dialog model and the service provider API allows for a given model to interact with multiple services that provide similar functionally. Finally, the dataset will evoke interest in written vs. spoken language, discourse patterns, error handling and other linguistic phenomena related to dialog system research, development and design.


Sentiment Analysis Using Machine Learning and Python

#artificialintelligence

Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative, or neutral. In this article, I will show you how to build your own program to determine if an article on a website is positive, negative, or neutral using the Python programming language. If you prefer not to read this post and would like a video representation of it, you can check out the YouTube Video below and the full code on my Github. It goes through everything in this article with a little more detail and will help make it easy for you to start programming your own article sentiment analysis program even if you don't have the programming language Python installed on your computer. Or you can use both as supplementary materials for learning!