Fu, Zhongyang
Extracting Social Events for Tweets Using a Factor Graph
Liu, Xiaohua (Harbin Institute of Technology) | Zhou, Xiangyang (Microsoft Research Asia) | Fu, Zhongyang (Shanghai Jiao Tong University) | Wei, Furu (Microsoft Research Asia) | Zhou, Ming (Microsoft Research Asia)
Social events are events that occur between people where at least one person is aware of the other and of the event taking place. Extracting social events can play an important role in a wide range of applications, such as the construction of social networks. In this paper, we introduce the task of social event extraction for tweets, an important source of fresh events. One main challenge is the lack of information in a single tweet, which is rooted in the short and noise-prone nature of tweets. We propose to collectively extract social events from multiple similar tweets using a novel factor graph, exploiting the redundancy in tweets, i.e., the repeated occurrences of a social event in several tweets. We evaluate our method on a human-annotated data set and show that it outperforms all baselines, with an absolute gain of 21% in F1.
Collective Nominal Semantic Role Labeling for Tweets
Liu, Xiaohua (Harbin Institute of Technology) | Fu, Zhongyang (Shanghai Jiao Tong University) | Wei, Furu (Microsoft Research Asia) | Zhou, Ming (Microsoft Research Asia)
Tweets have become an increasingly popular source of fresh information. We investigate the task of Nominal Semantic Role Labeling (NSRL) for tweets, which aims to identify predicate-argument structures defined by nominals in tweets. Studies of this task can help fine-grained information extraction and retrieval from tweets. There are two main challenges in this task: 1) the lack of information in a single tweet, rooted in the short and noisy nature of tweets; and 2) the recovery of implicit arguments. We propose jointly conducting NSRL on multiple similar tweets using a graphical model, leveraging the redundancy in tweets to tackle these challenges. Extensive evaluations on a human-annotated data set demonstrate that our method outperforms two baselines with an absolute gain of 2.7% in F1.