Goto

Collaborating Authors

 Discourse & Dialogue


Dialog-based Language Learning

Neural Information Processing Systems

A long-term goal of machine learning research is to build an intelligent dialog agent. Most research in natural language understanding has focused on learning from fixed training sets of labeled data, with supervision either at the word level (tagging, parsing tasks) or sentence level (question answering, machine translation). This kind of supervision is not realistic of how humans learn, where language is both learned by, and used for, communication. In this work, we study dialog-based language learning, where supervision is given naturally and implicitly in the response of the dialog partner during the conversation. We study this setup in two domains: the bAbI dataset of (Weston et al., 2015) and large-scale question answering from (Dodge et al., 2015). We evaluate a set of baseline learning strategies on these tasks, and show that a novel model incorporating predictive lookahead is a promising approach for learning from a teacher's response. In particular, a surprising result is that it can learn to answer questions correctly without any reward-based supervision at all.


Partial Membership Latent Dirichlet Allocation

arXiv.org Machine Learning

Topic models (e.g., pLSA, LDA, sLDA) have been widely used for segmenting imagery. However, these models are confined to crisp segmentation, forcing a visual word (i.e., an image patch) to belong to one and only one topic. Yet, there are many images in which some regions cannot be assigned a crisp categorical label (e.g., transition regions between a foggy sky and the ground or between sand and water at a beach). In these cases, a visual word is best represented with partial memberships across multiple topics. To address this, we present a partial membership latent Dirichlet allocation (PM-LDA) model and an associated parameter estimation algorithm. This model can be useful for imagery where a visual word may be a mixture of multiple topics. Experimental results on visual and sonar imagery show that PM-LDA can produce both crisp and soft semantic image segmentations; a capability previous topic modeling methods do not have.


Sentiment Analysis of Movie Reviews (3): doc2vec

@machinelearnbot

This is the last โ€“ for now โ€“ installment of my mini-series on sentiment analysis of the Stanford collection of IMDB reviews (originally published on recurrentnull.wordpress.com). So far, we've had a look at classical bag-of-words models and word vectors (word2vec). We saw that from the classifiers used, logistic regression performed best, be it in combination with bag-of-words or word2vec. We also saw that while the word2vec model did in fact model semantic dimensions, it was less successful for classification than bag-of-words, and we explained that by the averaging of word vectors we had to perform to obtain input features on review (not word) level. So the question now is: How would distributed representations perform if we did not have to throw away information by averaging word vectors?


Opinion Mining - Sentiment Analysis and Beyond

@machinelearnbot

So you report with reasonable accuracies what the sentiment about a particular brand or product is. After publishing this report, your client comes back to you and says "Hey this is good. Now can you tell me ways in which I can convert the negative sentiments into positive sentiments?" โ€“ Sentiment Analysis stops there and we enter the realms of Opinion Mining. Opinion Mining is about having a deeper understanding of the review that was written. Typically, a detailed review will not just have a sentiment attached to it. It will have information and valuable feedback that can literally help to build the next strategy.


Sentiment Analysis of Movie Reviews (1):Bag-of-Words Models

@machinelearnbot

Imagine I show you a book review, on amazon.com, Imagine I hide the number of stars, โ€“ all you get to see is the number of stars. And now I'm asking you, that review, is it good or bad? Well, it should be easy, for humans (although depending on the input there can be lots of disagreement between humans, too.) But if you want to do it automatically, it turns out to be surprisingly difficult.


Sentiment Analysis of Movie Reviews (2): word2vec

@machinelearnbot

This is the continuation of my mini-series on sentiment analysis of movie reviews, which originally appeared on recurrentnull.wordpress.com. Last time, we had a look at how well classical bag-of-words models worked for classification of the Stanford collection of IMDB reviews. As it turned out, the "winner" was Logistic Regression, using both unigrams and bigrams for classification. The best classification accuracy obtained was .89 So, bag-of-words models may be surprisingly successful, but they are limited in what they can do.


A Simple Approach to Multilingual Polarity Classification in Twitter

arXiv.org Machine Learning

Recently, sentiment analysis has received a lot of attention due to the interest in mining opinions of social media users. Sentiment analysis consists in determining the polarity of a given text, i.e., its degree of positiveness or negativeness. Traditionally, Sentiment Analysis algorithms have been tailored to a specific language given the complexity of having a number of lexical variations and errors introduced by the people generating content. In this contribution, our aim is to provide a simple to implement and easy to use multilingual framework, that can serve as a baseline for sentiment analysis contests, and as starting point to build new sentiment analysis systems. We compare our approach in eight different languages, three of them have important international contests, namely, SemEval (English), TASS (Spanish), and SENTIPOLC (Italian). Within the competitions our approach reaches from medium to high positions in the rankings; whereas in the remaining languages our approach outperforms the reported results.


How to create a Twitter Sentiment Analysis using R and Shiny

@machinelearnbot

I will show you how to create a simple application in R & Shiny to perform Twitter Sentiment Analysis in real-time. First, I create a Shiny Project. Then, in the ui.R file, I put this code: Here, I will show a title, the current time, a table with Twitter user name, a bar graph and wordclouds. It's a really simply code, not complex at all. The purpose of it is just for testing and so you guys can practice R language.


SuperBowl XLIX in Tweets: Sentiment Analysis of 4 Million Tweets

@machinelearnbot

This blog was originally published on our Text Analysis blog, the blog post set out to analyze and visualize 4 million tweets collected during Superbowl XLIX. Not surprisingly, Superbowl XLIX generated a huge amount of chatter on social networks with Twitter Estimating that over 28.4 million posts made with terms relating to the Superbowl. At AYLIEN, we collected just under 4 million Tweets from Hashtags, Handles and Keywords we were monitoring. To keep our sample clean, we removed any reTweets and spam from the Tweets collected and only worked with those Tweets that were written in English. We were left with about 3.5 million Tweets to play with.


Whatsapp chat sentiment analysis in R

#artificialintelligence

Want to watch this again later? Report Need to report the video? Report Need to report the video? Need to report the video? This feature is not available right now.