A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogosphere
Tang, Jintao (National University of Defense Technology) | Wang, Ting (National University of Defense Technology) | Lu, Qin (Hong Kong Polytechnic University) | Wang, Ji (National Laboratory for Parallel and Distributed Processing) | Li, Wenjie (Hong Kong Polytechnic University)
Information diffusion in the blogosphere raises two key issues: (1) blog posts are usually short and noisy and often cover multiple themes, and (2) diffusion is driven primarily by the "word-of-mouth" effect, so topics evolve very quickly. This paper presents a novel topic tracking approach that addresses these issues by modeling a topic as a semantic graph in which the semantic relatedness between terms is learned from Wikipedia. For a given topic or post, the named entities, the Wikipedia concepts, and the semantic relatedness between them are extracted to generate the graph model. Noise is filtered out with a graph clustering algorithm. To handle topic evolution, the topic model is enriched using Wikipedia as background knowledge. Graph edit distance is then used to measure the similarity between a topic and its posts. The proposed method is tested on real-world blog data, and the experimental results show its advantage in tracking topics in short, noisy text.
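The graph comparison described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the `RELATEDNESS` table stands in for scores that the paper learns from Wikipedia, the names `build_graph`, `edit_distance`, and `similarity` are hypothetical, and the cost model (unit cost per inserted or deleted node, absolute weight difference per edge) is one simple choice that works because each node carries a unique term label. The clustering-based noise filter and the Wikipedia enrichment step are not shown.

```python
from itertools import combinations

# Hypothetical relatedness scores; the paper learns these from Wikipedia.
RELATEDNESS = {
    frozenset(("earthquake", "tsunami")): 0.8,
    frozenset(("earthquake", "japan")): 0.6,
    frozenset(("tsunami", "japan")): 0.5,
    frozenset(("japan", "nuclear plant")): 0.4,
}


def build_graph(terms, threshold=0.3):
    """Return (nodes, edges): terms as nodes, relatedness scores as edge weights."""
    nodes = set(terms)
    edges = {}
    for a, b in combinations(sorted(nodes), 2):
        weight = RELATEDNESS.get(frozenset((a, b)), 0.0)
        if weight >= threshold:  # drop weakly related pairs
            edges[frozenset((a, b))] = weight
    return nodes, edges


def edit_distance(g1, g2):
    """Edit cost between two graphs whose nodes have unique labels.

    Assumed cost model: 1 per inserted or deleted node, and the absolute
    weight difference per edge (a missing edge counts as weight 0).
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    node_cost = len(nodes1 ^ nodes2)
    edge_cost = sum(
        abs(edges1.get(key, 0.0) - edges2.get(key, 0.0))
        for key in edges1.keys() | edges2.keys()
    )
    return node_cost + edge_cost


def similarity(g1, g2):
    """Map the edit distance to [0, 1]; 1.0 means identical graphs."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    # Worst case: delete all of g1, then insert all of g2.
    max_cost = len(nodes1) + len(nodes2) + sum(edges1.values()) + sum(edges2.values())
    if max_cost == 0:
        return 1.0
    return 1.0 - edit_distance(g1, g2) / max_cost


topic = build_graph(["earthquake", "tsunami", "japan"])
on_topic_post = build_graph(["earthquake", "japan", "nuclear plant"])
off_topic_post = build_graph(["football", "cup"])

print(similarity(topic, on_topic_post))
print(similarity(topic, off_topic_post))
```

Under these assumptions, a post that shares terms and relations with the topic graph scores higher than an unrelated one, and a tracker could assign posts to a topic by thresholding this score.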
Jul-19-2011