In this paper, we propose causality as a unified framework to explain query answers and non-answers, thus generalizing and extending several previously proposed approaches of provenance and missing query result explanations. We develop our framework starting from the well-studied definition of actual causes by Halpern and Pearl. After identifying some undesirable characteristics of the original definition, we propose functional causes as a refined definition of causality with several desirable properties. These properties allow us to use our notion of causality in a database context and to apply it uniformly to define the causes of query results and their individual contributions in several ways: (i) we can model both provenance and non-answers, (ii) we can define explanations as either data in the input relations or relational operations in a query plan, and (iii) we can give graded degrees of responsibility to individual causes, thus allowing us to rank causes. In particular, our approach allows us to explain contributions to relational aggregate functions and to rank causes according to their respective responsibilities. We give complexity results and describe polynomial algorithms for evaluating causality in tractable cases. Throughout the paper, we illustrate the applicability of our framework with several examples. Overall, we develop in this paper the theoretical foundations of causality theory in a database context.
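To make the idea of graded responsibility concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes the contingency-based notion of responsibility commonly used with Halpern-Pearl style causality: a tuple `t` is a cause of a Boolean query answer if some set of other tuples (a contingency set) can be removed so that the query still holds but fails once `t` is removed as well, and its responsibility is `1 / (1 + k)` for the smallest such set of size `k`. The abstract does not state this formula, so treat it as an assumption; the relation names `R` and `S`, the example query, and the function names are all hypothetical.

```python
from itertools import combinations


def responsibility(tuples, query, t):
    """Brute-force responsibility of tuple t for a Boolean query being true.

    Assumed definition (not taken from the abstract): 1 / (1 + k), where k is
    the size of the smallest contingency set; 0.0 if t is not a cause at all.
    The search is exponential in the number of tuples.
    """
    others = [u for u in tuples if u != t]
    for k in range(len(others) + 1):  # try the smallest contingency sets first
        for gamma in combinations(others, k):
            remaining = set(tuples) - set(gamma)
            # t is counterfactual once gamma has been removed
            if query(remaining) and not query(remaining - {t}):
                return 1.0 / (1 + k)
    return 0.0


def q(db):
    """Hypothetical Boolean query: does some R(a, b) join with some S(b)?"""
    s_vals = {u[1] for u in db if u[0] == "S"}
    return any(u[0] == "R" and u[2] in s_vals for u in db)


db = {("R", 1, 2), ("R", 1, 3), ("S", 2), ("S", 3), ("S", 4)}

# Rank tuples by responsibility, highest first.
ranking = sorted(db, key=lambda u: responsibility(db, q, u), reverse=True)
```

In this toy instance every tuple that takes part in one of the two join witnesses gets responsibility 0.5, because one tuple of the other witness has to be removed first, while `("S", 4)` joins with nothing and gets 0.0.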
The Expressive Intelligence Studio is developing a new approach to freeform conversational interaction in playable media that combines dialogue management, natural language generation (NLG), and natural language understanding. In this paper, we present our method for dialogue generation, which has been fully implemented in a game we are developing called Talk of the Town. Eschewing a traditional NLG pipeline, we take up a novel approach that combines human language expertise with computer generativity. Specifically, this method utilizes a tool that we have developed for authoring context-free grammars (CFGs) whose productions come packaged with explicit metadata. Instead of terminally expanding top-level symbols (the conventional way of generating from a CFG), we employ an unusual middle-out procedure that targets mid-level symbols and traverses the grammar by both forward chaining and backward chaining, expanding symbols conditionally by testing against the current game state. We also discuss a series of associated authoring patterns and situate our approach against the few earlier projects in this area.
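The middle-out idea can be illustrated with a small sketch. This is not the authors' tool or the grammar used in Talk of the Town; it is a toy under stated assumptions: production metadata is reduced to a single precondition on a game-state dictionary, the first applicable production is always chosen, and the grammar, symbol names, and state keys (`knows_name`, `employed`) are hypothetical.

```python
# Each nonterminal maps to a list of productions: (precondition, body).
# Tokens written as "<name>" are nonterminals; everything else is literal text.
GRAMMAR = {
    "greeting_line": [
        (lambda s: True, ["<greet>", ", ", "<ask_about_work>"]),
    ],
    "greet": [
        (lambda s: s["knows_name"], ["Hello again"]),
        (lambda s: True, ["Hello, stranger"]),
    ],
    "ask_about_work": [
        (lambda s: s["employed"], ["how is the job?"]),
        (lambda s: True, ["found any work yet?"]),
    ],
}


def expand_token(tok, state):
    """Expand a nonterminal token, or return a literal token unchanged."""
    if tok.startswith("<") and tok.endswith(">"):
        return expand(tok[1:-1], state)
    return tok


def expand(symbol, state):
    """Forward chaining: use the first production whose precondition holds."""
    for precondition, body in GRAMMAR[symbol]:
        if precondition(state):
            return "".join(expand_token(tok, state) for tok in body)
    raise ValueError(f"no applicable production for {symbol!r}")


def find_parent(symbol, state):
    """Backward chaining: find an applicable production that mentions symbol."""
    ref = f"<{symbol}>"
    for head, productions in GRAMMAR.items():
        for precondition, body in productions:
            if ref in body and precondition(state):
                return head, body
    return None


def middle_out(target, state):
    """Expand a mid-level symbol first, then grow the line outward around it."""
    text = expand(target, state)
    symbol = target
    while True:
        parent = find_parent(symbol, state)
        if parent is None:  # reached a top-level symbol
            return text
        head, body = parent
        ref = f"<{symbol}>"
        # Keep the text already generated for the target; expand its siblings.
        text = "".join(text if tok == ref else expand_token(tok, state) for tok in body)
        symbol = head


line = middle_out("ask_about_work", {"knows_name": True, "employed": False})
```

Starting from the mid-level symbol means the content the system most wants to express is fixed first, and the surrounding material is generated to fit the current state.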
Kempter, Renato (Swiss Federal Institute of Technology Lausanne (EPFL)) | Sintsova, Valentina (Swiss Federal Institute of Technology Lausanne (EPFL)) | Musat, Claudiu (Swiss Federal Institute of Technology Lausanne (EPFL)) | Pu, Pearl (Swiss Federal Institute of Technology Lausanne (EPFL))
Spectators are increasingly using social platforms to express their opinions and share their emotions during big public events. Those reactions reveal the subjective perception of the event and extend its understanding. This has motivated us to develop a system to explore and visualize the volume, patterns, and trends of user sentiments as they evolve over time. Previous work in sentiment analysis and opinion mining has addressed these issues, but most of it distinguishes only two polarity categories, leaving a more detailed and insightful analysis to be desired. In this paper, we suggest using a fine-grained, multi-category emotion model to classify and visualize users' emotional reactions to public events. We describe EmotionWatch, a tool that constructs visual summaries of public emotions, and apply it to the 2012 Olympics as a test case. We report findings from a user study evaluating the usability of the tool and validating the emotion model. Results show that users prefer a more detailed inspection of public emotions over the simplified analysis. Despite the tool's complexity, users were able to effectively grasp, understand, and interpret the emotional reactions using EmotionWatch. The same user study also pointed out a few design improvements for the future development of analogous systems.
Much of opinion mining research focuses on product reviews because reviews are opinion-rich and contain little irrelevant information. However, this cannot be said about online discussions and comments. In such postings, the discussions can get highly emotional and heated with many emotional statements, and even personal attacks. As a result, many of the postings and sentences do not express positive or negative opinions about the topic being discussed. To find people’s opinions on a topic and its different aspects, which we call evaluative opinions, those irrelevant sentences should be removed. The goal of this research is thus to identify evaluative opinion sentences. A novel unsupervised approach is proposed to solve the problem, and our experimental results show that it performs well.
This post was originally published on our Text Analysis blog. It set out to analyze and visualize 4 million tweets collected during Super Bowl XLIX. Not surprisingly, Super Bowl XLIX generated a huge amount of chatter on social networks, with Twitter estimating that over 28.4 million posts were made with terms relating to the Super Bowl. At AYLIEN, we collected just under 4 million tweets from the hashtags, handles, and keywords we were monitoring. To keep our sample clean, we removed any retweets and spam from the tweets collected and worked only with those written in English. We were left with about 3.5 million tweets to play with.
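The cleaning step described above can be sketched roughly as follows. This is an illustration only, not the pipeline we ran: the field names (`text`, `lang`, `is_retweet`) and the `looks_like_spam` heuristic are assumptions made for the example, and a real spam filter would be considerably more involved.

```python
def looks_like_spam(tweet):
    """Hypothetical spam heuristic: flag tweets stuffed with links."""
    return tweet["text"].count("http") >= 3


def clean(tweets, is_spam=looks_like_spam):
    """Keep original (non-retweet), non-spam tweets written in English."""
    return [
        t for t in tweets
        if not t["is_retweet"] and t["lang"] == "en" and not is_spam(t)
    ]


sample = [
    {"text": "What a catch!", "lang": "en", "is_retweet": False},
    {"text": "RT What a catch!", "lang": "en", "is_retweet": True},
    {"text": "Que partido!", "lang": "es", "is_retweet": False},
    {"text": "win http://a http://b http://c", "lang": "en", "is_retweet": False},
]

kept = clean(sample)
```

Only the first tweet in `sample` survives all three filters.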