Wikipedia-Based Distributional Semantics for Entity Relatedness
Aggarwal, Nitish (National University of Ireland, Galway) | Buitelaar, Paul (National University of Ireland, Galway)
Wikipedia provides an enormous amount of background knowledge for reasoning about the semantic relatedness between two entities. We propose Wikipedia-based Distributional Semantics for Entity Relatedness (DiSER), which represents the semantics of an entity by its distribution in the high-dimensional concept space derived from Wikipedia. DiSER measures the semantic relatedness between two entities by quantifying the distance between the corresponding high-dimensional vectors. DiSER builds its model from annotated entities only; it therefore improves over existing approaches, which do not distinguish between an entity and its surface form. We evaluate the approach on a benchmark that contains relative entity relatedness scores for 420 entity pairs. Our approach improves accuracy by 12% over state-of-the-art methods for computing entity relatedness. We also evaluate DiSER on the Entity Disambiguation task, using a dataset of 50 sentences with highly ambiguous entity mentions, where it shows an improvement of 10% in precision over the best-performing methods. To provide a resource for finding all the entities related to a given entity, we construct a graph in which the nodes represent Wikipedia entities and the edges carry the relatedness scores. Wikipedia contains more than 4.1 million entities, which required efficient computation of the relatedness scores for the corresponding 17 trillion entity pairs.
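As a rough illustration of the vector-space idea in the abstract, the following Python sketch stores each entity as a sparse vector over Wikipedia concepts and scores relatedness by cosine similarity. The concept names, the weights, and the choice of cosine are assumptions made for illustration only; the abstract states that DiSER quantifies the distance between high-dimensional vectors but does not specify the weighting or the distance function.

```python
import math


def cosine_relatedness(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as {concept: weight} dicts."""
    # Iterate over the shorter vector so the dot product touches fewer keys.
    if len(vec_a) > len(vec_b):
        vec_a, vec_b = vec_b, vec_a
    dot = sum(weight * vec_b.get(concept, 0.0) for concept, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def ordered_pair_count(num_entities):
    """Number of ordered entity pairs, i.e. cells in a full relatedness matrix."""
    return num_entities * num_entities


# Hypothetical toy vectors; real concept weights would come from Wikipedia annotations.
apple_inc = {"Steve Jobs": 0.9, "IPhone": 0.8, "Cupertino": 0.5}
steve_jobs = {"Steve Jobs": 1.0, "IPhone": 0.6, "Pixar": 0.7}
apple_fruit = {"Orchard": 0.9, "Cider": 0.7}

if __name__ == "__main__":
    print(cosine_relatedness(apple_inc, steve_jobs))   # shared concepts, so positive
    print(cosine_relatedness(apple_inc, apple_fruit))  # no shared concepts, so 0.0
    print(ordered_pair_count(4_100_000))               # 16810000000000
```

Because the vectors are built per entity rather than per surface form, the company and the fruit get separate vectors here even though both are written "Apple". The last line also shows that 4.1 million entities yield about 16.8 trillion ordered pairs, which is consistent with the figure of 17 trillion quoted in the abstract if ordered pairs are counted.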
Nov-1-2014
- Country:
- Asia > Middle East
- Israel (0.14)
- Europe (1.00)
- North America > United States (0.46)
- Genre:
- Research Report > Promising Solution (0.34)
- Industry:
- Information Technology (0.94)
- Leisure & Entertainment (0.94)
- Media (0.69)
- Technology: