Integration of UMLS and MEDLINE in Unsupervised Word Sense Disambiguation
Yepes, Antonio Jimeno (National Library of Medicine) | Aronson, Alan R. (National Library of Medicine)
The scarcity of training data for word sense disambiguation argues for knowledge-based disambiguation methods, which rely on information available in terminological resources. Unfortunately, these resources are generally not optimized for word sense disambiguation. MEDLINE, by contrast, contains many examples of ambiguous biomedical words in context, but these examples are not labeled with their proper sense. We propose integrating the UMLS and MEDLINE to create concept profiles, which are then used to perform knowledge-based word sense disambiguation. Our results show an accuracy of 0.8770 on a biomedical word sense disambiguation data set, a statistically significant improvement over other UMLS-based knowledge-based methods on the same data set.
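The abstract does not specify how the concept profiles are built or compared, so the following is only a minimal sketch of the general idea, under stated assumptions: each candidate concept gets a bag-of-words profile built from text associated with it, and an ambiguous word is assigned to the concept whose profile is most similar (by cosine similarity) to its context. The concept names, example sentences, and the functions `build_profile` and `disambiguate` are hypothetical and stand in for data that would come from the UMLS and MEDLINE; this is not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(documents):
    """Build a term-frequency profile from texts associated with one concept."""
    profile = Counter()
    for doc in documents:
        profile.update(tokenize(doc))
    return profile


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context, profiles):
    """Return the concept whose profile is most similar to the context."""
    ctx = Counter(tokenize(context))
    return max(profiles, key=lambda concept: cosine(ctx, profiles[concept]))


# Hypothetical toy data: in practice the texts would be retrieved from
# MEDLINE using terms that the UMLS links to each candidate concept.
CANDIDATE_TEXTS = {
    "cold_temperature": [
        "exposure to low temperature and cold stress in mice",
        "cold storage temperature of samples",
    ],
    "common_cold": [
        "patients with common cold and rhinovirus infection",
        "symptoms of viral cold infection include cough",
    ],
}

PROFILES = {c: build_profile(docs) for c, docs in CANDIDATE_TEXTS.items()}

if __name__ == "__main__":
    sentence = "rhinovirus infection causes cold symptoms in patients"
    print(disambiguate(sentence, PROFILES))
```

No sense-labeled training examples are needed in this setup: the profiles come from text linked to each concept rather than from annotated occurrences of the ambiguous word, which is the property that distinguishes knowledge-based from supervised disambiguation.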
Nov-5-2012
- Country:
- North America > United States > Ohio (0.14)
- Genre:
- Research Report > New Finding (0.54)