Semantic Enrichments in Text Supervised Classification: Application to Medical Domain
Albitar, Shereen (Aix-Marseille Université, LSIS) | Espinasse, Bernard (Aix-Marseille Université, LSIS) | Fournier, Sébastien (Aix-Marseille Université, LSIS)
The use of semantics in supervised text classification can improve its effectiveness, especially in specific domains. Most state-of-the-art works use concepts as an alternative to words in order to transform the classical bag of words (BOW) into a bag of concepts (BOC). This transformation is done through a conceptualization task. Furthermore, the resulting BOC can be enriched with related concepts drawn from semantic resources, which may further enhance classification effectiveness. This paper focuses on two strategies for the semantic enrichment of conceptualized text representations. The first is based on the semantic kernel method, while the second is based on the enriching vectors method. These two strategies are evaluated through experiments in the medical domain, using Rocchio as the supervised classification method, the UMLS ontology as the semantic resource, and the Ohsumed corpus.
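The pipeline described in the abstract (conceptualization, enrichment, then Rocchio classification) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy `TERM_TO_CONCEPT` and `RELATED` tables stand in for UMLS, the concept identifiers are invented, `weight=0.5` is an arbitrary choice, and the classifier is a simplified centroid-only variant of Rocchio with cosine similarity. Only an enriching-vectors style strategy is sketched; the semantic kernel strategy is not shown.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy resources; a real system would query UMLS instead.
TERM_TO_CONCEPT = {
    "heart attack": "C_MI",
    "myocardial infarction": "C_MI",
    "aspirin": "C_ASPIRIN",
    "asthma": "C_ASTHMA",
    "inhaler": "C_INHALER",
}
RELATED = {
    "C_MI": ["C_HEART_DISEASE"],
    "C_ASPIRIN": ["C_NSAID"],
    "C_ASTHMA": ["C_LUNG_DISEASE"],
    "C_INHALER": ["C_LUNG_DISEASE"],
}


def conceptualize(text):
    """Turn a text into a bag of concepts (BOC) by naive substring matching."""
    text = text.lower()
    boc = Counter()
    for term, concept in TERM_TO_CONCEPT.items():
        hits = text.count(term)
        if hits:
            boc[concept] += hits
    return boc


def enrich(boc, weight=0.5):
    """Add related concepts to the BOC, down-weighted by `weight` (assumed value)."""
    enriched = defaultdict(float)
    for concept, freq in boc.items():
        enriched[concept] += freq
        for related in RELATED.get(concept, []):
            enriched[related] += weight * freq
    return dict(enriched)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def train_rocchio(labeled_vectors):
    """Build one centroid per class (simplified Rocchio: no negative prototype)."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter()
    for vector, label in labeled_vectors:
        counts[label] += 1
        for k, x in vector.items():
            sums[label][k] += x
    return {
        label: {k: x / counts[label] for k, x in total.items()}
        for label, total in sums.items()
    }


def classify(vector, centroids):
    """Assign the class whose centroid is most similar to the vector."""
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


training = [
    ("Patient had a heart attack and was given aspirin.", "cardio"),
    ("Asthma treated with an inhaler.", "respiratory"),
]
centroids = train_rocchio(
    [(enrich(conceptualize(text)), label) for text, label in training]
)
prediction = classify(enrich(conceptualize("Acute myocardial infarction")), centroids)
```

In this toy setup the test document shares no surface words with the training texts, yet it is still matched to the cardiology class because "heart attack" and "myocardial infarction" map to the same hypothetical concept; the related concepts added by `enrich` give the vectors additional points of overlap.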
May-7-2014