Open Domain Short Text Conceptualization: A Generative + Descriptive Modeling Approach
Song, Yangqiu (University of Illinois at Urbana-Champaign) | Wang, Shusen (Zhejiang University) | Wang, Haixun (Google)
Concepts embody the knowledge that facilitates our cognitive processes of learning. Mapping short texts to a large set of open-domain concepts has found many successful applications. In this paper, we unify the existing conceptualization methods from a Bayesian perspective and discuss three modeling approaches: descriptive, generative, and discriminative. Motivated by their respective advantages and shortcomings, we develop a generative + descriptive modeling approach. Our model considers term relatedness in context and yields disambiguated conceptualizations. We report short text clustering results on a news title data set and a Twitter message data set, and demonstrate the effectiveness of the developed approach compared with state-of-the-art conceptualization and topic modeling approaches.
Jul-15-2015
- Country:
  - North America
    - Canada (0.04)
    - United States
  - Asia
    - Middle East > Jordan (0.04)
    - Japan (0.04)
    - India (0.04)
    - China (0.04)
- Technology: