Embedded Topic Models Enhanced by Wikification
Takashi Shibuya, Takehito Utsuro
arXiv.org Artificial Intelligence
Topic modeling analyzes a collection of documents to learn meaningful patterns of words. However, previous topic models consider only the spelling of words and ignore homographs, that is, words that share a spelling but differ in meaning. In this study, we incorporate Wikipedia knowledge into a neural topic model to make it aware of named entities. We evaluate our method on two datasets: 1) news articles from the New York Times and 2) the AIDA-CoNLL dataset. Our experiments show that our method improves the generalizability of neural topic models. Moreover, we analyze frequent terms in each topic and the temporal dependencies between topics to demonstrate that our entity-aware topic models capture the time-series development of topics well.
Oct-3-2024