Extracting Social Events for Tweets Using a Factor Graph

AAAI Conferences

Social events are events that occur between people where at least one person is aware of the other and of the event taking place. Extracting social events can play an important role in a wide range of applications, such as the construction of social networks. In this paper, we introduce the task of social event extraction for tweets, an important source of fresh events. One main challenge is the lack of information in a single tweet, which is rooted in the short and noise-prone nature of tweets. We propose to collectively extract social events from multiple similar tweets using a novel factor graph, in order to exploit the redundancy in tweets, i.e., the repeated occurrences of a social event in several tweets. We evaluate our method on a human-annotated data set and show that it outperforms all baselines, with an absolute gain of 21% in F1.


Emoticon Smoothed Language Models for Twitter Sentiment Analysis

AAAI Conferences

Twitter sentiment analysis (TSA) has become a hot research topic in recent years. The goal of this task is to discover the attitude or opinion expressed in tweets, which is typically formulated as a machine-learning-based text classification problem. Some methods use manually labeled data to train fully supervised models, while others use noisy labels, such as emoticons and hashtags, for model training. In general, only a limited amount of training data is available for the fully supervised models because manually labeling tweets is labor-intensive and time-consuming. Models trained on noisy labels, by contrast, can draw on a large amount of data, but the noise in the labels makes it hard for them to achieve satisfactory performance. Hence, the best strategy is to utilize both manually labeled data and noisy labeled data for training. However, how to seamlessly integrate these two different kinds of data into the same learning framework is still a challenge. In this paper, we present a novel model, called emoticon smoothed language model (ESLAM), to handle this challenge. The basic idea is to train a language model based on the manually labeled data, and then use the noisy emoticon data for smoothing. Experiments on real data sets demonstrate that ESLAM can effectively integrate both kinds of data to outperform those methods using only one of them.
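The abstract does not state which smoothing form ESLAM uses, so the following is only a minimal sketch of the general idea, assuming per-class unigram language models and simple linear interpolation between the estimate from manually labeled tweets and the estimate from emoticon-labeled tweets. The function names, the weight `alpha`, the probability `floor`, and the toy tweets are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter

def unigram_probs(texts):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def smoothed_prob(token, labeled, noisy, alpha=0.5, floor=1e-6):
    """Interpolate the labeled-data estimate with the noisy-data estimate.

    alpha and floor are illustrative values, not values from the paper.
    """
    p = (1 - alpha) * labeled.get(token, 0.0) + alpha * noisy.get(token, 0.0)
    return max(p, floor)

def classify(text, models, alpha=0.5):
    """Return the class whose smoothed model gives the highest log-likelihood."""
    def score(label):
        labeled, noisy = models[label]
        return sum(math.log(smoothed_prob(tok, labeled, noisy, alpha))
                   for tok in text.lower().split())
    return max(models, key=score)

# Toy data (hypothetical): one hand-labeled tweet per class plus
# emoticon-labeled tweets standing in for the noisy data.
models = {
    "pos": (unigram_probs(["great phone"]),
            unigram_probs(["love this :)", "awesome day :)"])),
    "neg": (unigram_probs(["terrible phone"]),
            unigram_probs(["hate this :(", "awful day :("])),
}

print(classify("awesome phone", models))  # "awesome" is only seen in noisy pos data
```

In this toy setup the word "awesome" never appears in the hand-labeled tweets, so only the emoticon-derived estimate lets the classifier separate the two classes, which is the kind of gap the smoothing step is meant to cover.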


Modeling Textual Cohesion for Event Extraction

AAAI Conferences

Event extraction systems typically locate the role fillers for an event by analyzing sentences in isolation and identifying each role filler independently of the others. We argue that more accurate event extraction requires a view of the larger context to decide whether an entity is related to a relevant event. We propose a bottom-up approach to event extraction that initially identifies candidate role fillers independently and then uses that information as well as discourse properties to model textual cohesion. The novel component of the architecture is a sequentially structured sentence classifier that identifies event-related story contexts. The sentence classifier uses lexical associations and discourse relations across sentences, as well as domain-specific distributions of candidate role fillers within and across sentences. This approach yields state-of-the-art performance on the MUC-4 data set, achieving substantially higher precision than previous systems.


Cruising with a Battery-Powered Vehicle and Not Getting Stranded

AAAI Conferences

The main hindrance to widespread market penetration of battery-powered electric vehicles (BEVs) has been their limited energy reservoir, resulting in cruising ranges of a few hundred kilometers unless one allows for recharging or switching of depleted batteries during a trip. Unfortunately, recharging typically takes several hours, and battery switch stations providing fully recharged batteries are still quite rare, certainly not as widespread as ordinary gas stations. To avoid getting stranded with an empty battery, going on a BEV trip requires some planning ahead, taking into account the energy characteristics of the BEV as well as available battery switch stations. In this paper we consider very basic, yet fundamental problems for E-Mobility: Can I get from A to B and back with my BEV without recharging in between? Can I get from A to B when allowed to recharge? How can I minimize the number of battery switches when going from A to B? We provide efficient and mathematically sound algorithms for these problems that allow for the energy-aware planning of trips.
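The abstract does not describe the algorithms themselves. As a rough illustration of the first question (A to B and back without recharging), here is a generic Bellman-Ford-style sketch, not the authors' method. It assumes each road segment has a fixed energy cost (positive when driving consumes energy, negative when going downhill recuperates it), that the charge can never exceed the battery capacity or drop below zero, and that no cycle yields a net energy gain. All names and the toy road network are hypothetical.

```python
def max_charge_on_arrival(edges, source, capacity, start_charge=None):
    """Best achievable battery charge at each reachable node.

    edges: list of (u, v, energy); energy > 0 consumes, energy < 0 recuperates.
    A segment is usable only if the charge stays non-negative; recuperated
    energy beyond the battery capacity is lost.
    """
    charge = capacity if start_charge is None else start_charge
    best = {source: charge}
    nodes = {source} | {u for u, _, _ in edges} | {v for _, v, _ in edges}
    for _ in range(len(nodes) - 1):          # Bellman-Ford round limit
        changed = False
        for u, v, energy in edges:
            if u not in best:
                continue
            arriving = min(capacity, best[u] - energy)
            if arriving >= 0 and arriving > best.get(v, -1):
                best[v] = arriving
                changed = True
        if not changed:
            break
    return best

def round_trip_possible(edges, a, b, capacity):
    """Can the vehicle drive a -> b -> a on a single full charge?"""
    there = max_charge_on_arrival(edges, a, capacity)
    if b not in there:
        return False
    back = max_charge_on_arrival(edges, b, capacity, start_charge=there[b])
    return a in back

# Toy network (hypothetical): uphill from A to B, downhill on the way back.
roads = [("A", "B", 6), ("B", "A", -2)]
print(round_trip_possible(roads, "A", "B", capacity=10))  # True
print(round_trip_possible(roads, "A", "B", capacity=5))   # False
```

Tracking the maximum charge on arrival, rather than the minimum energy spent, is what lets the sketch handle the capacity cap: energy recuperated on a full battery cannot be banked for later.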


Automatically Generating Algebra Problems

AAAI Conferences

We propose computer-assisted techniques for helping with pedagogy in Algebra. In particular, given a proof problem p (of the form “Left-hand-side-term = Right-hand-side-term”), we show how to automatically generate problems that are similar to p. We believe that such a tool can be used by teachers in making examinations where they need to test students on problems similar to what they taught in class, and by students in generating practice problems tailored to their specific needs. Our first insight is that we can generalize p syntactically to a query Q that implicitly represents a set of problems [[Q]] (which includes p). Our second insight is that we can explore the space of problems [[Q]] automatically, use classical results from polynomial identity testing to generate only those problems in [[Q]] that are correct, and then use pruning techniques to generate only unique and interesting problems. Our third insight is that with a small amount of manual tuning on the query Q, the user can interactively guide the computer to generate problems of interest to her. We present the technical details of the above-mentioned steps, and also describe a tool where these steps have been implemented. We also present an empirical evaluation on a wide variety of problems from various sub-fields of algebra, including polynomials, trigonometry, calculus, determinants, etc. Our tool is able to generate a rich corpus of similar problems from each given problem; while some of these similar problems were already present in the textbook, several were new!
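The abstract mentions polynomial identity testing without further detail. A common form of it is randomized evaluation in the style of the Schwartz-Zippel lemma: two polynomials that agree at many random points modulo a large prime are identical with high probability. The sketch below assumes that form and is not the tool's actual implementation. The function name, the choice of prime, and the trial count are hypothetical.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1, a Mersenne prime used as the modulus

def probably_identical(lhs, rhs, num_vars, trials=20, seed=0):
    """Randomized check that lhs == rhs as polynomials in num_vars variables.

    lhs and rhs are Python callables built from +, -, * and integer powers.
    A single disagreement proves the identity false; agreement on every
    trial makes it true with high probability.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        point = [rng.randrange(PRIME) for _ in range(num_vars)]
        if (lhs(*point) - rhs(*point)) % PRIME != 0:
            return False
    return True

# A correct candidate problem: (x + y)^2 = x^2 + 2xy + y^2
print(probably_identical(lambda x, y: (x + y) ** 2,
                         lambda x, y: x * x + 2 * x * y + y * y,
                         num_vars=2))  # True

# An incorrect candidate that a generator would have to discard.
print(probably_identical(lambda x, y: (x + y) ** 2,
                         lambda x, y: x * x + y * y,
                         num_vars=2))
```

A generator enumerating candidate problems could use a check like this as a cheap filter, keeping only the candidates that pass.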


Unsupervised Detection of Music Boundaries by Time Series Structure Features

AAAI Conferences

[...] scientific domains, including artificial intelligence (Keogh 2011). Research on time series has a long tradition, but its application to real-world datasets requires to cope with new relevant issues, such as the multiple dimensionality of data or limited computational resources. Specifically, dealing with large-scale data, (1) algorithms must be efficient, i.e. they have to scale, (2) supervised approaches may become unfeasible, and (3) solutions must use general techniques, i.e. they should be as independent of the domain as possible (see Mueen and Keogh 2010 for a more detailed discussion).

In music, boundaries may occur because of multiple changes, such as a change in instrumentation, a change in harmony, or a change in tempo. The seminal approach by Foote (2000) estimated these changes by means of a so-called novelty curve, obtained by sliding a short-time checkerboard kernel over the diagonal of a self-similarity matrix of pairwise sample comparisons. Works inspired by Foote's approach explicitly make use of the concept of novelty curves (Paulus et al. 2010). Other music-targeted approaches exploit homogeneities in a time series by employing more refined techniques like hidden Markov [...]
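The novelty-curve idea attributed to Foote (2000) can be sketched in a few lines. This is a simplified illustration, not the method proposed in the paper: it assumes a one-dimensional series, uses the negative absolute difference as the pairwise similarity, and applies an untapered checkerboard kernel (the kernel is commonly tapered in practice). Function names and the toy series are hypothetical.

```python
def self_similarity(series):
    """S[i][j] = similarity of samples i and j (here: negative distance)."""
    return [[-abs(a - b) for b in series] for a in series]

def novelty_curve(sim, half_width):
    """Slide a checkerboard kernel along the main diagonal of sim.

    The kernel weights the two same-side quadrants (past-past, future-future)
    by +1 and the two cross quadrants (past-future, future-past) by -1, so
    the score is high where both sides are internally similar but differ
    from each other.
    """
    n = len(sim)
    curve = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for a in range(-half_width, half_width):
            for b in range(-half_width, half_width):
                sign = 1.0 if (a < 0) == (b < 0) else -1.0
                total += sign * sim[t + a][t + b]
        curve[t] = total
    return curve

# Toy series with a single change between index 3 and index 4.
series = [0, 0, 0, 0, 5, 5, 5, 5]
curve = novelty_curve(self_similarity(series), half_width=2)
print(curve.index(max(curve)))  # 4, the first sample of the new segment
```

Peaks of the curve are then taken as boundary candidates; choosing the kernel width trades sensitivity to short segments against robustness to noise.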


HyperPlay: A Solution to General Game Playing with Imperfect Information

AAAI Conferences

General Game Playing is the design of AI systems able to understand the rules of new games and to use such descriptions to play those games effectively. Games with imperfect information have recently been added as a new challenge for existing general game-playing systems. The HyperPlay technique presents a solution to this challenge by maintaining a collection of models of the true game as a foundation for reasoning and move selection. The technique provides existing game players with a bolt-on solution to convert from perfect-information games to imperfect-information games. In this paper we describe the HyperPlay technique, show how it was adapted for use with a Monte Carlo decision-making process, and give experimental results for its performance.


Identifying Adverse Drug Events by Relational Learning

AAAI Conferences

The pharmaceutical industry, consumer protection groups, users of medications and government oversight agencies are all strongly interested in identifying adverse reactions to drugs. While a clinical trial of a drug may use only a thousand patients, once a drug is released on the market it may be taken by millions of patients. As a result, in many cases adverse drug events (ADEs) are observed in the broader population that were not identified during clinical trials. Therefore, there is a need for continued, postmarketing surveillance of drugs to identify previously unanticipated ADEs. This paper casts this problem as a reverse machine learning task related to relational subgroup discovery, and provides an initial evaluation of this approach based on experiments with an actual EMR/EHR and known adverse drug events.


Identifying Bullies with a Computer Game

AAAI Conferences

Current computer involvement in adolescent social networks (youth between the ages of 11 and 17) provides new opportunities to study group dynamics, interactions amongst peers, and individual preferences. Nevertheless, most of the research in this area focuses on efficiently retrieving information that is explicit in large social networks (e.g., properties of the graph structure), but not on how to use the dynamics of the virtual social network to discover latent characteristics of the real-world social network. In this paper, we present the analysis of a game designed to take advantage of the familiarity of adolescents with online social networks, and describe how the data generated by the game can be used to identify bullies in 5th grade classrooms. We present a probabilistic model of the game and, using the in-game interactions of the players (i.e., the content of chat messages), infer their social role within their classroom (either bully or non-bully). We evaluate our model using previously collected data from psychological surveys on the same 5th grade population and by comparing its performance with that of off-the-shelf classifiers.


Construction of New Medicines via Game Proof Search

AAAI Conferences

The production of any new medicine requires solutions to many planning problems. The most fundamental of these is determining the sequence of chemical reactions necessary to physically create the drug. Surprisingly, these organic syntheses can be modeled as branching paths in a discrete, fully-observable state space, making the construction of new medicines an application of heuristic search. We describe a model of organic chemistry that is amenable to traditional AI techniques from game tree search, regression, and automatic assembly sequencing. We demonstrate the applicability of AND/OR graph search by developing the first chemistry solver to use proof-number search. Finally, we construct a benchmark suite of organic synthesis problems collected from undergraduate organic chemistry exams, and we analyze our solver's performance both on this suite and in recreating the synthetic plan for a multibillion-dollar drug.