 Discourse & Dialogue


Sentiment classification on node level for RNTN and SVN • /r/MachineLearning

@machinelearnbot

I have a question regarding this paper (http://nlp.stanford.edu/…). On page 7 of the paper, Table 1 reports results for All and Root. For All, they use the results of all nodes of the tree. For Root, they use the results at sentence level.
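To make the All versus Root distinction concrete, here is a minimal sketch of how the two accuracies could be computed over labeled parse trees. The `Node` class, the toy trees, and the label values are hypothetical illustrations, not the paper's data or code.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """A parse-tree node with a gold label and a predicted label (hypothetical structure)."""
    gold: int
    pred: int
    children: List["Node"] = field(default_factory=list)


def iter_nodes(node: Node) -> Iterator[Node]:
    """Yield the node itself and every node below it."""
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def accuracy(pairs: List[Tuple[int, int]]) -> float:
    """Fraction of (gold, pred) pairs that match."""
    return sum(g == p for g, p in pairs) / len(pairs)


def all_and_root_accuracy(trees: List[Node]) -> Tuple[float, float]:
    # "All": every node (words, phrases, and the root) counts as one example.
    all_pairs = [(n.gold, n.pred) for t in trees for n in iter_nodes(t)]
    # "Root": only the top node, i.e. the whole sentence, counts.
    root_pairs = [(t.gold, t.pred) for t in trees]
    return accuracy(all_pairs), accuracy(root_pairs)


# Toy data: two sentences, each with a root and two child phrases.
tree_a = Node(gold=1, pred=1, children=[Node(1, 1), Node(0, 1)])
tree_b = Node(gold=0, pred=1, children=[Node(0, 0), Node(0, 0)])
all_acc, root_acc = all_and_root_accuracy([tree_a, tree_b])
print(f"All: {all_acc:.3f}  Root: {root_acc:.3f}")
```

The two numbers can diverge: a model may label most short phrases correctly while still getting the sentence-level label wrong, or the reverse.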


Conditional Generation and Snapshot Learning in Neural Dialogue Systems

arXiv.org Machine Learning

Recently a variety of LSTM-based conditional language models (LM) have been applied across a range of language generation tasks. In this work we study various model architectures and different ways to represent and aggregate the source information in an end-to-end neural dialogue system framework. A method called snapshot learning is also proposed to facilitate learning from supervised sequential signals by applying a companion cross-entropy objective function to the conditioning vector. The experimental and analytical results demonstrate firstly that competition occurs between the conditioning vector and the LM, and the differing architectures provide different trade-offs between the two. Secondly, the discriminative power and transparency of the conditioning vector is key to providing both model interpretability and better performance. Thirdly, snapshot learning leads to consistent performance improvements independent of which architecture is used.


David Blei - Wikipedia, the free encyclopedia

#artificialintelligence

David Blei is a Professor in the Statistics and Computer Science departments at Columbia University. Prior to fall 2014 he was an Associate Professor in the Department of Computer Science at Princeton University. His work is primarily in machine learning. His research interests include topic models and he was one of the original developers of latent Dirichlet allocation. As of November 11, 2015, his publications have been cited 31,135 times, giving him an h-index of 53.[1]


Brand Image, Sentiment Analysis and Social Media

@machinelearnbot

Analyzing sentiment is a very subjective exercise. He has his own mid-sized software company, which was doing fairly well, and at one time he tried to analyze the broader sentiment about his company's brand value on the open market. The overall brand sentiment turned out to be negative, and even more surprising was that it leaned towards the most negative end of the scale. This surprised him because the other parameters that his HR partners provided painted a contrasting picture: the attrition rate was low, the employee engagement survey produced positive results, and so on. He then did a deep dive into the feedback content and realized that almost all of the comments were negative, that the people who posted feedback were all disgruntled employees, and that few happy employees had posted any feedback on any social forum. They were too busy with their work and adding more value to the organization.


Sentiment Analysis with Talend & Stanford CoreNLP Datalytyx

@machinelearnbot

In my previous blog, I showed you how to integrate Stanford CoreNLP with Talend using a simple example. In this post I'll show you how to modify that code in order to make the most of Talend's strengths as a data integration tool. Below is a Talend job I have built to read some tweets from a database (see this blog article for information on how to retrieve tweets with Talend), run the text through the CoreNLP sentiment analysis code, and then write the tweets back to the database with the sentiment added. In this particular example, the text to be analysed consists of tweets coming from a database. However, the same job will work with any string input.


Mental Health Alerts via Facebook? - The Crux

#artificialintelligence

Every day, 730,000 comments and 420 billion statuses are posted on Facebook, 500 billion 140-character tweets are posted and 430,000 hours of new video are uploaded to YouTube. The Internet is a goldmine of data just waiting to be analyzed. As social media has crept deeper and deeper into our daily lives, governments and advertisers have been utilizing this data for myriad purposes. Now, a team of researchers at the University of Ottawa, University of Alberta and the Université de Montpellier in France is examining ways to use social media data to detect and monitor people who are potentially at risk of mental health issues. Using computer algorithms, the team will apply social web mining and "sentiment analysis methods" to troves of data generated through social media to detect at-risk individuals. Sentiment analysis is the process of identifying and categorizing opinions expressed in text through a computer program.


Provable Algorithms for Inference in Topic Models

arXiv.org Machine Learning

Recently, there has been considerable progress on designing algorithms with provable guarantees -- typically using linear algebraic methods -- for parameter learning in latent variable models. But designing provable algorithms for inference has proven to be more challenging. Here we take a first step towards provable inference in topic models. We leverage a property of topic models that enables us to construct simple linear estimators for the unknown topic proportions that have small variance, and consequently can work with short documents. Our estimators also correspond to finding an estimate around which the posterior is well-concentrated. We show lower bounds that for shorter documents it can be information theoretically impossible to find the hidden topics. Finally, we give empirical results that demonstrate that our algorithm works on realistic topic models. It yields good solutions on synthetic data and runs in time comparable to a single iteration of Gibbs sampling.
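As a rough illustration of what a linear estimator for topic proportions can look like, the sketch below applies a left inverse of a known topic-word matrix to a document's empirical word frequencies. The abstract does not specify how the paper builds its estimator; the Moore-Penrose pseudo-inverse used here is a swapped-in assumption, and the matrix `A`, the word counts, and all function names are hypothetical. The code handles only two topics so that it needs no external libraries.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular; topics are not distinguishable")
    return [[d / det, -b / det], [-c / det, a / det]]


def left_inverse(topic_word):
    """Pseudo-inverse B = (A^T A)^-1 A^T of a V x 2 topic-word matrix, so B A = I.

    Assumption: the pseudo-inverse stands in for whatever left inverse
    the paper actually constructs.
    """
    at = transpose(topic_word)
    return matmul(inverse_2x2(matmul(at, topic_word)), at)


def estimate_proportions(left_inv, word_counts):
    """Linear estimate: apply the left inverse to empirical word frequencies."""
    total = sum(word_counts)
    freqs = [c / total for c in word_counts]
    return [sum(b * f for b, f in zip(row, freqs)) for row in left_inv]


# Hypothetical model: 3 vocabulary words, 2 topics; each column sums to 1.
A = [[0.7, 0.1],
     [0.2, 0.2],
     [0.1, 0.7]]
B = left_inverse(A)

# Counts chosen to match the expected frequencies for proportions (0.25, 0.75).
theta_hat = estimate_proportions(B, [25, 20, 55])
print(theta_hat)
```

On real documents the counts are noisy, so the estimate only approximates the true proportions, and the noise grows as documents get shorter.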


Combinatorial Topic Models using Small-Variance Asymptotics

arXiv.org Machine Learning

Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In contrast, we study topic modeling as a combinatorial optimization problem, and propose a new objective function derived from LDA by passing to the small-variance limit. We minimize the derived objective by using ideas from combinatorial optimization, which results in a new, fast, and high-quality topic modeling algorithm. In particular, we show that our results are competitive with popular LDA-based topic modeling approaches, and also discuss the (dis)similarities between our approach and its probabilistic counterparts.


Toward a general, scaleable framework for Bayesian teaching with applications to topic models

arXiv.org Machine Learning

Machines, not humans, are the world's dominant knowledge accumulators but humans remain the dominant decision makers. Interpreting and disseminating the knowledge accumulated by machines requires expertise and time, and is prone to failure. The problem of how best to convey accumulated knowledge from computers to humans is a critical bottleneck in the broader application of machine learning. We propose an approach based on human teaching where the problem is formalized as selecting a small subset of the data that will, with high probability, lead the human user to the correct inference. This approach, though successful for modeling human learning in simple laboratory experiments, has failed to achieve broader relevance due to challenges in formulating general and scalable algorithms. We propose general-purpose teaching via pseudo-marginal sampling and demonstrate the algorithm by teaching topic models. Simulation results show that our sampling-based approach effectively approximates the probability where ground-truth is possible via enumeration, results in data that are markedly different from those expected by random sampling, and speeds learning especially for small amounts of data. Application to movie synopsis data illustrates differences between teaching and random sampling for teaching distributions and specific topics, and demonstrates gains in scalability and applicability to real-world problems.


Integrating Stanford CoreNLP with Talend Studio Datalytyx

@machinelearnbot

In my previous blog Twitter Sentiment Analysis using Talend, I showed how to extract tweets from Twitter using Talend and then how to do some basic sentiment analysis on those tweets. In this post, I will introduce the Stanford CoreNLP toolkit and show how to integrate it with Talend to perform various NLP (Natural Language Processing) analyses including sentiment analysis. Previously I had managed to perform some basic sentiment analysis on tweets. However, I'd noticed a major flaw with my technique: the method I was using would take each word in a sentence and average the sentiment score of each word. I explain the issue in more detail in my original post, but to give you a flavour of it, I'll show you some examples of correct/incorrect sentiment identification that would result from my previous method: This is incorrect as it should be fairly obvious that this sentence carries negative sentiment.
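The flaw in word-level averaging can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the lexicon `WORD_SCORES` and the example sentence are hypothetical, not taken from the original post, and the scoring function is only an illustration of the averaging idea.

```python
# Hypothetical word-level sentiment lexicon: +1 positive, -1 negative, 0 neutral.
WORD_SCORES = {
    "good": 1.0,
    "great": 1.0,
    "bad": -1.0,
    "terrible": -1.0,
    "not": 0.0,  # negation carries no score of its own in a word-level lexicon
}


def average_word_sentiment(sentence, lexicon=WORD_SCORES):
    """Score a sentence as the mean of its per-word scores (unknown words count as 0)."""
    words = [w.strip(".,!?") for w in sentence.lower().split()]
    scores = [lexicon.get(w, 0.0) for w in words]
    return sum(scores) / len(scores) if scores else 0.0


# The sentence is negative, but the average only sees the word "good".
score = average_word_sentiment("This film is not good")
label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
print(score, label)
```

Because each word is scored in isolation, the negation has no effect on "good" and the sentence comes out positive. Approaches that score phrases in context, such as the tree-based sentiment model in Stanford CoreNLP, are designed to handle this kind of composition.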