We propose a dynamic joint sentiment-topic model (dJST) which allows the detection and tracking of views of current and recurrent interests and shifts in topic and sentiment. Both topic and sentiment dynamics are captured by assuming that the current sentiment-topic specific word distributions are generated according to the word distributions at previous epochs. We derive efficient online inference procedures to sequentially update the model with newly arrived data and show the effectiveness of our proposed model on the Mozilla add-on reviews crawled between 2007 and 2011.
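The idea that current word distributions depend on those of previous epochs can be illustrated with a minimal sketch. It assumes, purely for illustration, that the Dirichlet prior for the current epoch is a weighted mixture of earlier word distributions scaled by a concentration value. The function `evolve_prior`, the `weights` argument, and the `strength` parameter are hypothetical names and are not taken from the dJST paper.

```python
def evolve_prior(previous_distributions, weights, strength=100.0):
    """Build Dirichlet pseudo-counts for the current epoch.

    previous_distributions: one word distribution per past epoch, each a
        list of probabilities over the same vocabulary.
    weights: one non-negative weight per past epoch (hypothetical; a real
        model would have to set or learn these).
    strength: total pseudo-count mass assigned to the prior (assumption).
    """
    if not previous_distributions:
        raise ValueError("need at least one previous epoch")
    if len(previous_distributions) != len(weights):
        raise ValueError("one weight per previous epoch is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    vocab_size = len(previous_distributions[0])
    mixed = [0.0] * vocab_size
    for dist, weight in zip(previous_distributions, weights):
        for i, prob in enumerate(dist):
            mixed[i] += (weight / total) * prob

    # Scale the mixed distribution into Dirichlet pseudo-counts.
    return [strength * prob for prob in mixed]


# Two past epochs over a two-word vocabulary, weighted equally.
prior = evolve_prior([[0.5, 0.5], [1.0, 0.0]], weights=[1, 1], strength=10.0)
```

Giving a larger weight to recent epochs would make the prior track short-term shifts, while spreading weight over many epochs would favor long-running topics; which choice is appropriate depends on the data.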
Location-based social sites, such as Foursquare or Yelp, are gaining increasing popularity. These sites allow users to check in at venues and leave a short commentary in the form of a micro-review. Micro-reviews are rich in content as they offer a distilled and concise account of user experience. In this paper we consider the problem of predicting the topic of a micro-review by a user who visits a new venue. Such a prediction can help users make informed decisions, and also help venue owners personalize users’ experiences. However, topic modeling for micro-reviews is particularly difficult, due to their short and fragmented nature. We address this issue using pooling strategies, which aggregate micro-reviews at the venue or user level, and we propose novel probabilistic models based on Latent Dirichlet Allocation (LDA) for extracting the topics related to a user-venue pair. Our best topic model integrates influences from both venue-inherent properties and user preferences, while also considering the sentiment orientation of the users. Experimental results on real datasets demonstrate the superiority of this model compared to simpler models and previous work; they also show that venue-inherent properties have a stronger influence on the topics of micro-reviews.
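The pooling strategy mentioned above can be sketched as follows. This is a minimal illustration under the assumption that each micro-review is a record with `user`, `venue`, and `text` fields; the field names, the helper `pool_reviews`, and the sample reviews are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def pool_reviews(reviews, key):
    """Concatenate micro-review texts that share the same value of `key`.

    Each pooled string can then be treated as one longer pseudo-document
    for a topic model, which is the motivation for pooling short texts.
    """
    pooled = defaultdict(list)
    for review in reviews:
        pooled[review[key]].append(review["text"])
    return {group: " ".join(texts) for group, texts in pooled.items()}


# Made-up sample data for illustration only.
reviews = [
    {"user": "u1", "venue": "v1", "text": "great espresso"},
    {"user": "u2", "venue": "v1", "text": "slow service"},
    {"user": "u1", "venue": "v2", "text": "friendly staff"},
]

venue_docs = pool_reviews(reviews, "venue")  # venue-level pooling
user_docs = pool_reviews(reviews, "user")    # user-level pooling
```

Venue-level pooling groups what different users say about the same place, whereas user-level pooling groups what one user says across places; the sketch only builds the pseudo-documents and leaves the topic model itself out.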
With the development of Web 2.0, sentiment analysis has become a popular research problem. Recently, topic models have been introduced for the simultaneous analysis of topics and sentiment in a document. These studies, which jointly model topic and sentiment, take advantage of the relationship between topics and sentiment, and are shown to be superior to traditional sentiment analysis tools. However, most of them assume that, given the parameters, the sentiments of the words in the document are all independent. In our observation, in contrast, sentiments are expressed in a coherent way. Local conjunctive words, such as “and” or “but”, are often indicative of sentiment transitions. In this paper, we propose a major departure from the previous approaches by making two linked contributions. First, we assume that the sentiments are related to the topic in the document, and put forward a joint sentiment and topic model, Sentiment-LDA. Second, we observe that sentiments are dependent on local context. Thus, we further extend the Sentiment-LDA model to the Dependency-Sentiment-LDA model by relaxing the sentiment independence assumption in Sentiment-LDA. The sentiments of words are viewed as a Markov chain in Dependency-Sentiment-LDA. Through experiments, we show that exploiting the sentiment dependency is clearly advantageous, and that Dependency-Sentiment-LDA is an effective approach for sentiment analysis.
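The view of word sentiments as a Markov chain can be illustrated with a small generative sketch. It assumes a two-label chain in which the probability of switching sentiment depends on the preceding word, with conjunctions such as "but" raising it. The probabilities and the names `switch_probability` and `sample_sentiments` are hypothetical, and the sketch shows only the dependency assumption, not the inference procedure of Dependency-Sentiment-LDA.

```python
import random

# Hypothetical transition parameters, chosen only for illustration.
P_SWITCH_DEFAULT = 0.1                      # switch chance after ordinary words
P_SWITCH_AFTER = {"but": 0.9, "and": 0.02}  # conjunction-specific overrides


def switch_probability(previous_word):
    """Probability that the sentiment changes after `previous_word`."""
    return P_SWITCH_AFTER.get(previous_word.lower(), P_SWITCH_DEFAULT)


def sample_sentiments(words, rng, initial="positive"):
    """Sample one sentiment label per word from a first-order Markov chain."""
    labels = []
    current = initial
    for i, word in enumerate(words):
        # The label of word i depends on the label of word i-1 and on
        # whether word i-1 is a conjunction that signals a transition.
        if i > 0 and rng.random() < switch_probability(words[i - 1]):
            current = "negative" if current == "positive" else "positive"
        labels.append(current)
    return labels


words = "the food was great but the service was slow".split()
labels = sample_sentiments(words, random.Random(0))
```

Under an independence assumption each label would be drawn without reference to its neighbor; here a label persists unless a transition fires, which is what makes the sampled sentiments locally coherent.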
We analyze patient reviews of doctors using a novel probabilistic joint model of topic and sentiment based on factorial LDA (Paul and Dredze 2012). We leverage this model to exploit a small set of previously annotated reviews to automatically analyze the topics and sentiment latent in over 50,000 online reviews of physicians (and we make this dataset publicly available). The proposed model outperforms baseline models for this task with respect to model perplexity and sentiment classification. We report the most representative words with respect to positive and negative sentiment along three clinical aspects, thus complementing existing qualitative work exploring patient reviews of physicians.
To help users quickly understand the major opinions from massive online reviews, it is important to automatically reveal the latent structure of the aspects, sentiment polarities, and the association between them. However, there is little work available to do this effectively. In this paper, we propose a hierarchical aspect sentiment model (HASM) to discover a hierarchical structure of aspect-based sentiments from unlabeled online reviews. In HASM, the whole structure is a tree. Each node itself is a two-level tree, whose root represents an aspect and whose children represent the sentiment polarities associated with it. Each aspect or sentiment polarity is modeled as a distribution over words. To automatically extract both the structure and parameters of the tree, we use a Bayesian nonparametric model, the recursive Chinese Restaurant Process (rCRP), as the prior and jointly infer the aspect-sentiment tree from the review texts. Experiments on two real datasets show that our model is comparable to two other hierarchical topic models in terms of quantitative measures of topic trees. It is also shown that our model achieves better sentence-level classification accuracy than previously proposed aspect-sentiment joint models.
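The tree described above can be sketched as a data structure. This is a minimal illustration that assumes word distributions are stored as plain dictionaries; the class name `AspectNode`, its fields, and the example words and probabilities are hypothetical. It shows only the shape of the structure, not how the rCRP prior infers it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AspectNode:
    """One tree node: an aspect root plus its sentiment-polarity children."""

    # Word distribution for the aspect itself (the root of the two-level tree).
    aspect_words: Dict[str, float]
    # One word distribution per sentiment polarity (the second level).
    sentiment_words: Dict[str, Dict[str, float]]
    # More specific sub-aspects, each again an AspectNode.
    children: List["AspectNode"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of aspect levels in the subtree rooted at this node."""
        return 1 + max((child.depth() for child in self.children), default=0)


# Made-up example: a general aspect with one more specific sub-aspect.
screen = AspectNode(
    aspect_words={"screen": 0.6, "display": 0.4},
    sentiment_words={
        "positive": {"bright": 0.7, "sharp": 0.3},
        "negative": {"dim": 0.5, "scratched": 0.5},
    },
)
root = AspectNode(
    aspect_words={"laptop": 0.5, "product": 0.5},
    sentiment_words={
        "positive": {"good": 1.0},
        "negative": {"bad": 1.0},
    },
    children=[screen],
)
```

Keeping the sentiment distributions inside each node, rather than as separate tree nodes, mirrors the description of every node as a small two-level tree attached to the larger aspect hierarchy.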