Data augmentation for low resource sentiment analysis using generative adversarial networks
Gupta, Rahul
Sentiment analysis is a task that may suffer from a lack of data in certain cases, as the datasets are often generated and annotated by humans. In cases where data is inadequate for training discriminative models, generative models may aid training via data augmentation. Generative Adversarial Networks (GANs) are one such class of models that have advanced the state of the art in several tasks, including image and text generation. In this paper, I train GAN models on low resource datasets, then use them for the purpose of data augmentation towards improving sentiment classifier generalization. Given the constraints of limited data, I explore various techniques to train the GAN models. I also present an analysis of the quality of generated GAN data as more training data for the GAN is made available. In this analysis, the generated data is evaluated as a test set (against a model trained on real data points) as well as a training set to train classification models. Finally, I also conduct a visual analysis by projecting the generated and the real data into a two-dimensional space using the t-Distributed Stochastic Neighbor Embedding (t-SNE) method.
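The two-way evaluation protocol above (generated data scored once as a test set and once as a training set) can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the "real" and "generated" points are hypothetical Gaussian clusters standing in for sentiment features and GAN output, and the classifier is a plain logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "real" and GAN-"generated" sentiment features (2-D) with labels.
X_real = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
y_real = np.array([0] * 50 + [1] * 50)
X_gen = np.vstack([rng.normal(-1, 0.7, (50, 2)), rng.normal(1, 0.7, (50, 2))])
y_gen = np.array([0] * 50 + [1] * 50)

def train_logreg(X, y, lr=0.1, steps=500):
    """Logistic regression fit by batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def accuracy(w, b, X, y):
    return float((((X @ w + b) > 0).astype(int) == y).mean())

# (i) Generated data as a *test set* for a model trained on real data.
w, b = train_logreg(X_real, y_real)
acc_gen_as_test = accuracy(w, b, X_gen, y_gen)

# (ii) Generated data as a *training set*, evaluated on the real data.
w2, b2 = train_logreg(X_gen, y_gen)
acc_gen_as_train = accuracy(w2, b2, X_real, y_real)
```

If the generated distribution tracks the real one, both scores stay high; a collapse in either direction flags low-quality generations.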
Semi-supervised and Transfer learning approaches for low resource sentiment classification
Gupta, Rahul, Sahu, Saurabh, Espy-Wilson, Carol, Narayanan, Shrikanth
Sentiment classification involves quantifying the affective reaction of a human to a document, media item or an event. Although researchers have investigated several methods to reliably infer sentiment from lexical, speech and body language cues, training a model with a small amount of labeled data is still a challenge. For instance, in expanding sentiment analysis to new languages and cultures, it may not always be possible to obtain comprehensive labeled datasets. In this paper, we investigate the application of semi-supervised and transfer learning methods to improve performance on low resource sentiment classification tasks. We experiment with extracting dense feature representations, pre-training and manifold regularization to enhance the performance of sentiment classification systems. Our goal is a coherent implementation of these methods, and we evaluate the gains they achieve in a matched setting involving training and testing on a single corpus, as well as in two cross-corpora settings. In both cases, our experiments demonstrate that the proposed methods can significantly enhance model performance over a purely supervised approach, particularly when only a handful of training examples are available.
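Of the techniques named above, manifold regularization is the one with a compact closed form: predictions are penalized for varying sharply between nearby (possibly unlabeled) points. A minimal sketch, with hypothetical random features and scores in place of real sentiment data, showing the graph-Laplacian penalty and its equivalent pairwise form:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))   # hypothetical feature vectors (labeled + unlabeled)
f = rng.normal(size=20)        # classifier scores on those points

# RBF similarity graph over all points.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2)
np.fill_diagonal(W, 0.0)

# Graph Laplacian L = D - W; the penalty f^T L f is small when scores
# are smooth over the data manifold -- the core of manifold regularization.
L = np.diag(W.sum(1)) - W
penalty = float(f @ L @ f)

# Equivalent pairwise form: 0.5 * sum_ij W_ij (f_i - f_j)^2
pairwise = 0.5 * float((W * (f[:, None] - f[None, :]) ** 2).sum())
```

In a semi-supervised objective this penalty is simply added to the supervised loss, which is how unlabeled points influence the decision boundary.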
Adversarial Auto-encoders for Speech Based Emotion Recognition
Sahu, Saurabh, Gupta, Rahul, Sivaraman, Ganesh, AbdAlmageed, Wael, Espy-Wilson, Carol
Recently, generative adversarial networks and adversarial autoencoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition. They map the autoencoder's bottleneck layer output (termed code vectors) to different noise Probability Distribution Functions (PDFs), which can be further regularized to cluster based on class information. In addition, they also allow the generation of synthetic samples by sampling the code vectors from the mapped PDFs. Inspired by these properties, we investigate the application of adversarial autoencoders to the domain of emotion recognition. Specifically, we conduct experiments on the following two aspects: (i) their ability to encode high dimensional feature vector representations for emotional utterances into a compressed space (with a minimal loss of emotion class discriminability in the compressed space), and (ii) their ability to regenerate synthetic samples in the original feature space, to be later used for purposes such as training emotion recognition classifiers. We demonstrate the promise of adversarial autoencoders with regards to these aspects on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpus and present our analysis.
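The structure described above can be made concrete with a forward-pass sketch. This is not the paper's model: the dimensions, weights and data below are hypothetical and untrained; it only shows how the three pieces (encoder, decoder, code-space discriminator against a prior PDF) fit together, and how synthetic samples fall out of decoding prior draws.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical dimensions: 40-D utterance features compressed to a 2-D code.
d_in, d_code = 40, 2
W_enc = rng.normal(0, 0.1, (d_in, d_code))
W_dec = rng.normal(0, 0.1, (d_code, d_in))
W_dis = rng.normal(0, 0.1, (d_code, 1))

x = rng.normal(size=(8, d_in))          # a batch of feature vectors

code = np.tanh(x @ W_enc)               # bottleneck "code vectors"
recon = code @ W_dec                    # decoder reconstruction
recon_loss = float(((x - recon) ** 2).mean())

# Adversarial part: a discriminator tries to tell code vectors apart from
# samples drawn from the target prior PDF (here a standard Gaussian).
prior = rng.normal(size=(8, d_code))
p_code = sigmoid(code @ W_dis)
p_prior = sigmoid(prior @ W_dis)
dis_loss = float(-(np.log(p_prior + 1e-9) + np.log(1 - p_code + 1e-9)).mean())

# Once training has matched codes to the prior, synthetic feature-space
# samples are obtained by decoding draws from that prior.
synthetic = prior @ W_dec
```

Training would alternate minimizing recon_loss (encoder + decoder) with the adversarial game between encoder and discriminator; that loop is omitted here.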
DeepFix: Fixing Common C Language Errors by Deep Learning
Gupta, Rahul (Indian Institute of Science Bangalore) | Pal, Soham (Indian Institute of Science Bangalore) | Kanade, Aditya (Indian Institute of Science Bangalore) | Shevade, Shirish (Indian Institute of Science Bangalore)
The problem of automatically fixing programming errors is a very active research topic in software engineering. This is a challenging problem as fixing even a single error may require analysis of the entire program. In practice, a number of errors arise due to a programmer's inexperience with the programming language or lack of attention to detail. We call these common programming errors. These are analogous to grammatical errors in natural languages. Compilers detect such errors, but their error messages are usually inaccurate. In this work, we present an end-to-end solution, called DeepFix, that can fix multiple such errors in a program without relying on any external tool to locate or fix them. At the heart of DeepFix is a multi-layered sequence-to-sequence neural network with attention which is trained to predict erroneous program locations along with the required correct statements. On a set of 6971 erroneous C programs written by students for 93 programming tasks, DeepFix could fix 1881 (27%) programs completely and 1338 (19%) programs partially.
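The attention component mentioned above is what lets a sequence-to-sequence model focus on a particular program location while emitting a fix. A minimal sketch of dot-product attention over encoder states, with hypothetical random vectors standing in for the hidden states of a tokenized C program:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical encoder states for a tokenized program: T tokens, d dims each.
T, d = 6, 8
enc = rng.normal(size=(T, d))   # encoder hidden states, one per token
dec = rng.normal(size=d)        # current decoder state

# Dot-product attention: score every token against the decoder state,
# normalize with a softmax, and form a weighted summary of the program.
scores = enc @ dec
weights = np.exp(scores - scores.max())
weights /= weights.sum()        # softmax over token positions
context = weights @ enc         # convex combination of encoder states
```

The weights form a distribution over token positions, so the peak of that distribution can be read as the model's guess at the erroneous location.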
Inferring object rankings based on noisy pairwise comparisons from multiple annotators
Gupta, Rahul, Narayanan, Shrikanth
Ranking a set of objects involves establishing an order allowing for comparisons between any pair of objects in the set. Oftentimes, due to the unavailability of a ground truth of ranked orders, researchers resort to obtaining judgments from multiple annotators followed by inferring the ground truth based on the collective knowledge of the crowd. However, the aggregation is often ad hoc and involves imposing stringent assumptions in inferring the ground truth (e.g. majority vote). In this work, we propose Expectation-Maximization (EM) based algorithms that rely on the judgments from multiple annotators and the object attributes for inferring the latent ground truth. The algorithms learn the relation between the latent ground truth and object attributes as well as annotator-specific probabilities of flipping, a metric to assess annotator quality. We further extend the EM algorithm to allow for a variable probability of flipping based on the pair of objects at hand. We test our algorithms on two data sets with synthetic annotations and investigate the impact of annotator quality and quantity on the inferred ground truth. We also report results on two other data sets with annotations from machine/human annotators and interpret the output trends based on the data characteristics.
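The core EM idea above, alternating a posterior over the latent preference of each pair with updates to per-annotator flip probabilities, can be sketched for the simplest case (no object attributes, uniform prior on each pair's orientation). The data, flip rates and annotator count below are synthetic and illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic setup: 30 object pairs, 5 annotators; annotator k flips the
# true pairwise preference with probability flip_true[k].
n_pairs, n_ann = 30, 5
z_true = rng.integers(0, 2, n_pairs)            # latent ground-truth preferences
flip_true = np.array([0.1, 0.2, 0.3, 0.1, 0.4])
flips = rng.random((n_pairs, n_ann)) < flip_true
labels = np.where(flips, 1 - z_true[:, None], z_true[:, None])

# EM: alternate the posterior over z (E-step) with flip-rate updates (M-step).
flip = np.full(n_ann, 0.25)
for _ in range(50):
    # E-step: P(z=1 | labels) under a uniform prior on z.
    ll1 = np.where(labels == 1, 1 - flip, flip).prod(1)
    ll0 = np.where(labels == 0, 1 - flip, flip).prod(1)
    q = ll1 / (ll1 + ll0)
    # M-step: expected fraction of flipped judgments per annotator.
    flip = (q[:, None] * (labels == 0) + (1 - q[:, None]) * (labels == 1)).mean(0)
    flip = np.clip(flip, 1e-3, 0.5)             # pin down the label-swap symmetry
z_hat = (q > 0.5).astype(int)
accuracy = float((z_hat == z_true).mean())
```

The estimated flip vector doubles as the annotator-quality metric; the paper's full algorithms additionally condition the flip probability on object attributes.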
Joint Structured Models for Extraction from Overlapping Sources
Gupta, Rahul, Sarawagi, Sunita
We consider the problem of jointly training structured models for extraction from sources whose instances enjoy partial overlap. This has important applications such as user-driven ad-hoc information extraction on the web. Such applications present new challenges in terms of the number of sources and their arbitrary patterns of overlap, not seen by earlier collective training schemes applied to two sources. We present an agreement-based learning framework and alternatives within it to trade off tractability, robustness to noise, and extent of agreement. We provide a principled scheme to discover low-noise agreement sets in unlabeled data across the sources. Through extensive experiments over 58 real datasets, we establish that our method of additively rewarding agreement over maximal segments of text provides the best trade-offs, and also outperforms alternatives such as collective inference, staged training, and multi-view learning.
Generalized Collective Inference with Symmetric Clique Potentials
Gupta, Rahul, Sarawagi, Sunita, Diwan, Ajit A.
Collective graphical models exploit inter-instance associative dependence to output more accurate labelings. However, existing models support only a very limited kind of associativity, which restricts accuracy gains. This paper makes two major contributions. First, we propose a general collective inference framework that biases data instances to agree on a set of {\em properties} of their labelings. Agreement is encouraged through symmetric clique potentials. We show that rich properties lead to bigger gains, and present a systematic inference procedure for a large class of such properties. The procedure performs message passing on the cluster graph, where property-aware messages are computed with cluster specific algorithms. This provides an inference-only solution for domain adaptation. Our experiments on bibliographic information extraction illustrate significant test error reduction over unseen domains. Our second major contribution consists of algorithms for computing outgoing messages from clique clusters with symmetric clique potentials. Our algorithms are exact for arbitrary symmetric potentials on binary labels and for max-like and majority-like potentials on multiple labels. For majority potentials, we also provide an efficient Lagrangian Relaxation based algorithm that compares favorably with the exact algorithm. We present a 13/15-approximation algorithm for the NP-hard Potts potential, with runtime sub-quadratic in the clique size. In contrast, the best known previous guarantee for graphs with Potts potentials is only 1/2. We empirically show that our method for Potts potentials is an order of magnitude faster than the best alternatives, and our Lagrangian Relaxation based algorithm for majority potentials beats the best applicable heuristic -- ICM.
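The defining feature of the symmetric clique potentials above is that they depend only on the histogram of labels in a clique, not on which vertex takes which label. A small sketch of two such potentials (a pairwise-agreement Potts-style reward and a majority-style reward); the function names and the lambda parameter are illustrative, not the paper's notation:

```python
from collections import Counter

def potts_potential(labels, lam=1.0):
    """Reward each agreeing pair in the clique: lam * sum_y C(n_y, 2).

    Symmetric: permuting which vertex holds which label leaves it unchanged.
    """
    counts = Counter(labels)
    return lam * sum(n * (n - 1) // 2 for n in counts.values())

def majority_potential(labels):
    """Majority-style potential: reward the size of the largest label block."""
    return max(Counter(labels).values())

# Two labelings with the same histogram {x: 3, y: 1} score identically.
a = potts_potential(["x", "x", "y", "x"])   # 3 agreeing pairs: C(3, 2) = 3
b = potts_potential(["x", "x", "x", "y"])
```

Computing the potential itself is easy; the hard part addressed in the paper is computing max-marginal messages out of a clique under such potentials, which is NP-hard for Potts over multiple labels.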