User-generated content is extremely valuable for mining market intelligence because it is unsolicited. We study the problem of analyzing the sentiment and opinions that users express in blog posts, message-board posts, and similar content with respect to topics expressed as a search query. In the scenario we consider, the matches of the search query terms are expanded through coreference and meronymy to produce a set of mentions. The mentions are contextually evaluated for sentiment, and their scores are aggregated (using a data structure we introduce, called the sentiment propagation graph) to produce an aggregate score for the input entity. A crucial part of the contextual evaluation of individual mentions is finding which sentiment expressions are semantically related to (target) which mentions; this is the focus of our paper. We present an approach in which potential target mentions for a sentiment expression are ranked using supervised machine learning (Support Vector Machines), where the main features are the syntactic configurations (typed dependency paths) connecting the sentiment expression and the mention. We have created a large English corpus of product discussion blogs annotated with semantic types of mentions, coreference, meronymy, and sentiment targets. The corpus shows that coreference and meronymy are not marginal phenomena but are central to determining the overall sentiment for the top-level entity. We evaluate a number of techniques for sentiment targeting and present results that we believe advance the current state of the art.
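The aggregation step described in the abstract above can be illustrated with a small sketch. The abstract does not specify how the sentiment propagation graph is built or how scores are combined, so everything here is an assumption made for illustration: the function name `aggregate_sentiment`, the representation of the graph as directed (parent, child) edges from an entity to its coreferent and part-of mentions, and the unweighted average over reachable mentions.

```python
from collections import defaultdict


def aggregate_sentiment(root, edges, mention_scores):
    """Average the scores of all scored mentions reachable from `root`.

    Hypothetical stand-in for the paper's sentiment propagation graph:
    `edges` holds (parent, child) pairs linking an entity to mentions
    related to it by coreference or meronymy (part-of).
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    total, count = 0.0, 0
    stack, seen = [root], set()
    while stack:
        node = stack.pop()
        if node in seen:  # guard against cycles and repeated edges
            continue
        seen.add(node)
        if node in mention_scores:
            total += mention_scores[node]
            count += 1
        stack.extend(children[node])
    # Assumption: an entity with no scored mentions is treated as neutral.
    return total / count if count else 0.0


# Toy data (invented): a query for "camera" expanded to a pronoun,
# a part ("lens"), and a part of that part ("autofocus").
edges = [("camera", "it"), ("camera", "lens"), ("lens", "autofocus")]
mention_scores = {
    "camera": 1.5,
    "it": 1.0,
    "lens": -1.0,
    "autofocus": -0.5,
    "tripod": 1.0,  # not linked to "camera", so it is ignored
}
score = aggregate_sentiment("camera", edges, mention_scores)
```

A real system would probably weight edges by relation type or by the confidence of the targeting classifier. The plain average is used here only to keep the sketch short.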
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Semantic word spaces have been very useful but cannot express the meaning of longer phrases in a principled way. Further progress towards understanding compositionality in tasks such as sentiment detection requires richer supervised training and evaluation resources and more powerful models of composition. To remedy this, we introduce a Sentiment Treebank. It includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality. To address them, we introduce the Recursive Neural Tensor Network.
George, Clint Pazhayidam (University of Florida) | Puri, Sahil (University of Florida) | Wang, Daisy Zhe (University of Florida) | Wilson, Joseph N. (University of Florida) | Hamilton, William F. (University of Florida)
Electronic discovery is an interesting subproblem of information retrieval in which one identifies documents that are potentially relevant to issues and facts of a legal case from an electronically stored document collection (a corpus). In this paper, we consider representing documents in a topic space using well-known topic models such as latent Dirichlet allocation and latent semantic indexing, and solving the information retrieval problem by finding document similarities in the topic space rather than in the corpus vocabulary space. We also develop an iterative SMART ranking and categorization framework that includes a human in the loop to label a set of seed (training) documents and uses them to build a semi-supervised binary document classification model based on Support Vector Machines. To improve this model, we propose a method for choosing seed documents from the whole population via an active learning strategy. We report the results of our experiments on a real dataset in the electronic discovery domain.
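The idea of comparing documents in topic space rather than vocabulary space can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the topic vectors have already been inferred by some topic model (for example, per-document topic proportions from latent Dirichlet allocation), and it assumes cosine similarity as the measure, which the abstract does not name. The function names, document IDs, and numbers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_u * norm_v)


def rank_by_topic_similarity(query_topics, doc_topics):
    """Return (doc_id, similarity) pairs, most similar document first."""
    scored = [(doc_id, cosine(query_topics, vec))
              for doc_id, vec in doc_topics.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented topic proportions over three topics for three documents.
doc_topics = {
    "memo_a": [0.8, 0.1, 0.1],
    "memo_b": [0.1, 0.8, 0.1],
    "memo_c": [0.4, 0.4, 0.2],
}
# Invented topic proportions for a query about the first topic.
query_topics = [0.9, 0.05, 0.05]

ranking = rank_by_topic_similarity(query_topics, doc_topics)
for doc_id, similarity in ranking:
    print(f"{doc_id}: {similarity:.3f}")
```

Because every document is reduced to a short vector of topic weights, two documents can score as similar even when they share few literal terms, which is the motivation for working in topic space.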
Conversational interfaces with computers have been the talk of tech since the days of Star Trek. They are mostly associated with voice response, and frustrating experiences interacting with Siri, chatbots, or the interactive voice response (IVR) systems of call centers reveal what a long slog it has been to get computers to understand natural language, whether in the form of voice or text. But it took the Amazon Echo's Alexa, which was designed as a conversational voice for Amazon's retail and entertainment services, to show that natural language interfaces could actually perform useful services. When we saw SAS founder Dr. James Goodnight demonstrate how Alexa could be used to query SAS Visual Analytics, we thought that was pretty cool. But when you look at this video, you'll realize that Alexa has only been taught a few things and has a long way to go before it will replace your keyboard or touchpad.