Automatic Identification of Key Concepts in Large PubMed Retrievals
Yeganova, Lana (National Library of Medicine, National Institutes of Health) | Grigoryan, Vahan (National Library of Medicine, National Institutes of Health) | Kim, Won (National Library of Medicine, National Institutes of Health) | Wilbur, W. John (National Library of Medicine, National Institutes of Health)
PubMed queries frequently retrieve thousands of documents, making it difficult for a user to identify information of interest. In this paper, we propose a method for automatically identifying central concepts in large PubMed retrievals. The centrality of a concept is modeled using the hypergeometric distribution. Retrieved documents are grouped by concept, which can help users navigate the retrieval. We test our method on five datasets, each representing a medical condition.
Nov-5-2012
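
The abstract states that concept centrality is modeled with the hypergeometric distribution but does not give the scoring formula. The sketch below shows one common way such a score might be computed, and it should not be read as the authors' implementation. It assumes a term is "central" when it occurs in the retrieved set more often than chance would predict from its frequency in the whole corpus, and it scores each term by the negative log of the hypergeometric upper-tail probability. The function names (`hypergeom_tail`, `rank_concepts`), the input layout, and the use of single terms as stand-ins for concepts are all hypothetical choices made for illustration.

```python
from math import comb, log10, inf


def hypergeom_tail(N, K, n, k):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: documents in the corpus
    K: corpus documents that contain the term
    n: documents in the retrieval
    k: retrieved documents that contain the term
    """
    lower = max(k, 0, n - (N - K))  # smallest feasible count at or above k
    upper = min(n, K)               # largest feasible count
    favorable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(lower, upper + 1)
    )
    return favorable / comb(N, n)


def rank_concepts(retrieved, doc_freq, corpus_size):
    """Rank terms by an assumed centrality score and group documents by term.

    retrieved:   dict mapping document id -> set of terms in that document
    doc_freq:    dict mapping term -> number of corpus documents containing it
    corpus_size: total number of documents in the corpus

    Returns a list of (term, score, sorted document ids), highest score first.
    """
    # Group the retrieved documents by the terms they contain.
    groups = {}
    for doc_id, terms in retrieved.items():
        for term in terms:
            groups.setdefault(term, set()).add(doc_id)

    n = len(retrieved)
    ranked = []
    for term, docs in groups.items():
        p = hypergeom_tail(corpus_size, doc_freq[term], n, len(docs))
        # Assumed score: -log10 of the tail probability (inf if p underflows).
        score = -log10(p) if p > 0 else inf
        ranked.append((term, score, sorted(docs)))

    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

Under these assumptions, a term that appears in most retrieved documents but is rare in the corpus receives a high score, while a term that is common everywhere scores near zero. Each returned entry also carries the ids of the retrieved documents containing that term, which is one simple way to realize the grouping by concept described in the abstract.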