Discourse & Dialogue
Knowledge-based end-to-end memory networks
Ganhotra, Jatin, Polymenakos, Lazaros
End-to-end dialog systems have become very popular because they hold the promise of learning directly from human-to-human dialog interaction. Retrieval and generative methods have been explored in this area with mixed results. A key element that is missing so far is the incorporation of a priori knowledge about the task at hand. This knowledge may exist in the form of structured or unstructured information. As a first step in this direction, we present a novel approach, Knowledge based end-to-end memory networks (KB-memN2N), which allows special handling of named entities for goal-oriented dialog tasks. We present results on two datasets: the DSTC6 challenge dataset and the dialog bAbI tasks.
Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation
Zhao, Tiancheng, Lee, Kyusong, Eskenazi, Maxine
The encoder-decoder dialog model is one of the most prominent methods used to build dialog systems in complex domains. Yet it is limited because it cannot output interpretable actions as in traditional systems, which hinders humans from understanding its generation process. We present an unsupervised discrete sentence representation learning method that can integrate with any existing encoder-decoder dialog models for interpretable response generation. Building upon variational autoencoders (VAEs), we present two novel models, DI-VAE and DI-VST that improve VAEs and can discover interpretable semantics via either auto encoding or context predicting. Our methods have been validated on real-world dialog datasets to discover semantic representations and enhance encoder-decoder models with interpretable generation.
Understanding What is Behind Sentiment Analysis – Part 2
Hint! Check Part I first, where we introduced a simple algorithm to analyze the sentiment of a given document. In this article we will talk about different modifications that might help us improve the performance of our classifier. To create a good classifier with the model described in Part I, we need a big and properly labelled corpus in order to compute a comprehensive word-sentiment occurrence table. The training corpus should contain statistically enough examples of each word in different contexts, so that the occurrences computed in the table give a good approximation of the words' real probabilities (frequencies). There are several techniques aimed at reducing the dimensionality of the problem to make it more manageable.
Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment
Mo, Kaixiang, Zhang, Yu, Yang, Qiang, Fung, Pascale
Dialogue policy transfer enables us to build dialogue policies in a target domain with little data by leveraging knowledge from a source domain with plenty of data. Dialogue sentences are usually represented by speech-acts and domain slots, and the dialogue policy transfer is usually achieved by assigning a slot mapping matrix based on human heuristics. However, existing dialogue policy transfer methods cannot transfer across dialogue domains with different speech-acts, for example, between systems built by different companies. Also, they depend on either common slots or slot entropy, which are not available when the source and target slots are totally disjoint and no database is available to calculate the slot entropy. To solve this problem, we propose a Policy tRansfer across dOMaIns and SpEech-acts (PROMISE) model, which is able to transfer dialogue policies across domains with different speech-acts and disjoint slots. The PROMISE model can learn to align different speech-acts and slots simultaneously, and it does not require common slots or the calculation of the slot entropy. Experiments on both real-world dialogue data and simulations demonstrate that PROMISE model can effectively transfer dialogue policies across domains with different speech-acts and disjoint slots.
ClassiNet -- Predicting Missing Features for Short-Text Classification
Bollegala, Danushka, Atanasov, Vincent, Maehara, Takanori, Kawarabayashi, Ken-ichi
The fundamental problem in short-text classification is \emph{feature sparseness} -- the lack of feature overlap between a trained model and a test instance to be classified. We propose \emph{ClassiNet} -- a network of classifiers trained for predicting missing features in a given instance, to overcome the feature sparseness problem. Using a set of unlabeled training instances, we first learn binary classifiers as feature predictors for predicting whether a particular feature occurs in a given instance. Next, each feature predictor is represented as a vertex $v_i$ in the ClassiNet where a one-to-one correspondence exists between feature predictors and vertices. The weight of the directed edge $e_{ij}$ connecting a vertex $v_i$ to a vertex $v_j$ represents the conditional probability that given $v_i$ exists in an instance, $v_j$ also exists in the same instance. We show that ClassiNets generalize word co-occurrence graphs by considering implicit co-occurrences between features. We extract numerous features from the trained ClassiNet to overcome feature sparseness. In particular, for a given instance $\vec{x}$, we find similar features from ClassiNet that did not appear in $\vec{x}$, and append those features in the representation of $\vec{x}$. Moreover, we propose a method based on graph propagation to find features that are indirectly related to a given short-text. We evaluate ClassiNets on several benchmark datasets for short-text classification. Our experimental results show that by using ClassiNet, we can statistically significantly improve the accuracy in short-text classification tasks, without having to use any external resources such as thesauri for finding related features.
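The edge weights described above are conditional probabilities between features. As a rough illustration only, the sketch below estimates such weights from explicit co-occurrence counts, i.e. the plain word co-occurrence graph that the abstract says ClassiNets generalize. It does not use learned feature predictors, and the function name `cooccurrence_edges` and the toy instances are assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_edges(instances):
    """Estimate P(v_j in instance | v_i in instance) from explicit counts.

    `instances` is an iterable of feature lists. This is a count-based
    baseline, not the classifier-based predictors used in ClassiNet.
    """
    feature_count = Counter()  # number of instances containing each feature
    pair_count = Counter()     # number of instances containing both i and j
    for instance in instances:
        features = set(instance)  # count each feature once per instance
        feature_count.update(features)
        pair_count.update(permutations(features, 2))  # ordered pairs (i, j)
    # Directed edge weight e_ij = count(i and j) / count(i)
    return {
        (i, j): count / feature_count[i]
        for (i, j), count in pair_count.items()
    }


# Hypothetical toy data: each instance is a bag of features (words).
instances = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
edges = cooccurrence_edges(instances)
```

Note that the weights are asymmetric: in the toy data every instance containing "c" also contains "a", but not the other way round.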
How to Perform Sentiment analysis in Excel Without Writing Code?
We recently announced a new version of Excel Add-in which lets you perform state-of-the-art text analysis capabilities from the comforts of your spreadsheets without writing a single line of code. The add-in has been received very well by users working across different industry verticals like Market Research, Software, Consumer Goods, Education, etc. solving a variety of use-cases. Sentiment analysis has been the most used function of our Excel add-in closely followed by Emotion detection. Many of our users use sentiment analysis in Excel to quickly and accurately analyze the responses of their open-ended surveys, online chatter around their product/service or to analyze product reviews from e-commerce sites. In this blog post, we will discuss how to use the function Sentiment Analysis in Excel Add-in to do text analytics for any type of content.
Towards Training Probabilistic Topic Models on Neuromorphic Multi-chip Systems
Xiao, Zihao, Chen, Jianfei, Zhu, Jun
Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). So far, their training has been implemented on general purpose computers (GPCs), which are flexible to program but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models. The first SNN is a batch algorithm combining the conventional collapsed Gibbs sampling (CGS) algorithm and an inference SNN to train LDA. The other two SNNs are online algorithms targeting both energy- and storage-limited environments. The two online algorithms are equivalent to training LDA by using maximum a posteriori estimation and by maximizing the semi-collapsed likelihood, respectively. They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable with the GPC algorithms, while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.
State Tracking Networks for Dialog State Tracking
Wang, Xuguang (Baidu Research) | Cheng, Xingyi (Baidu Research) | Zhou, Jie (Baidu Research) | Xu, Wei (Baidu Research)
Dialog state tracking aims to accurately infer a compact representation of the dialog status up to the current turn; it needs to summarize all of the dialog history and the user's goals. In a successful spoken dialog system, the dialog state tracker is one of the most important components of the pipeline. Yet until recently, there have been no general, flexible, accurate and truly end-to-end dialog state tracking models. In this paper, we propose a novel model named state tracking networks that can perform dialog state tracking in a natural, efficient and elegant way. It uses an explicit gate to model the state updating mechanism and can be trained end to end, either deterministically using standard backpropagation techniques or stochastically by reinforcement learning. Our model can deal with both ASR and text input without any modification. We perform experiments on the Second Dialog State Tracking Challenge dataset (DSTC2) and obtain performance matching the state-of-the-art models. Furthermore, the qualitative analysis reveals that the gating mechanism learned by our model agrees well with intuition.
Automated Classification of Text Sentiment
Dufourq, Emmanuel, Bassett, Bruce A.
The ability to identify sentiment in text, referred to as sentiment analysis, is one which is natural to adult humans. This task is, however, not one which a computer can perform by default. Identifying sentiments in an automated, algorithmic manner will be a useful capability for business and research in their search to understand what consumers think about their products or services and to understand human sociology. Here we propose two new Genetic Algorithms (GAs) for the task of automated text sentiment analysis. The GAs learn whether words occurring in a text corpus are either sentiment or amplifier words, and their corresponding magnitude. Sentiment words, such as 'horrible', add linearly to the final sentiment. Amplifier words in contrast, which are typically adjectives/adverbs like 'very', multiply the sentiment of the following word. This increases, decreases or negates the sentiment of the following word. The sentiment of the full text is then the sum of these terms. This approach grows both a sentiment and amplifier dictionary which can be reused for other purposes and fed into other machine learning algorithms. We report the results of multiple experiments conducted on large Amazon data sets. The results reveal that our proposed approach was able to outperform several public and/or commercial sentiment analysis algorithms.
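The scoring rule described above (sentiment words add linearly, amplifier words multiply the sentiment of the following word) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionaries and their values are invented for the example, whereas in the paper they are learned by the Genetic Algorithms. The handling of consecutive amplifiers (their factors are multiplied together) and of an amplifier followed by a neutral word (its effect is dropped) are also assumptions.

```python
def score_text(tokens, sentiment, amplifier):
    """Sum sentiment values, scaling each by any preceding amplifier(s)."""
    total = 0.0
    multiplier = 1.0  # factor to apply to the next non-amplifier word
    for token in tokens:
        if token in amplifier:
            # Assumption: consecutive amplifiers compound multiplicatively.
            multiplier *= amplifier[token]
        else:
            # Words missing from the sentiment dictionary contribute 0.
            total += multiplier * sentiment.get(token, 0.0)
            multiplier = 1.0  # an amplifier only affects the following word
    return total


# Hypothetical hand-written dictionaries, for illustration only.
sentiment = {"horrible": -2.0, "good": 1.0}
amplifier = {"very": 2.0, "not": -1.0}

very_horrible = score_text("very horrible".split(), sentiment, amplifier)
not_good = score_text("not good".split(), sentiment, amplifier)
mixed = score_text("good but horrible".split(), sentiment, amplifier)
```

Here a factor greater than 1 increases the sentiment of the next word, a factor between 0 and 1 would decrease it, and a negative factor negates it, matching the three effects named in the abstract.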
Sentiment Analysis of Code-Mixed Languages leveraging Resource Rich Languages
Choudhary, Nurendra, Singh, Rajat, Bindlish, Ishita, Shrivastava, Manish
Code-mixed data is an important challenge in natural language processing because its characteristics differ completely from the traditional structures of standard languages. In this paper, we propose a novel approach called Sentiment Analysis of Code-Mixed Text (SACMT) to classify sentences into their corresponding sentiment (positive, negative or neutral) using contrastive learning. We utilize the shared parameters of siamese networks to map the sentences of code-mixed and standard languages to a common sentiment space. We also introduce a basic clustering-based preprocessing method to capture variations of code-mixed transliterated words. Our experiments reveal that SACMT outperforms the state-of-the-art approaches in sentiment analysis for code-mixed text by 7.6% in accuracy and 10.1% in F-score.