Uncovering the Semantics of Wikipedia Categories
Heist, Nicolas; Paulheim, Heiko
arXiv.org Artificial Intelligence
Two of the most prominent public knowledge graphs, DBpedia [16] and YAGO [18], build rich taxonomies using Wikipedia's infoboxes and category graph, respectively. They describe more than five million entities and contain several hundred million triples [27]. When it comes to relation assertions (RAs), however, coverage is rather low even for basic properties: more than 50% of the 1.35 million persons in DBpedia have no birthplace assigned, and more than 80% of birthplaces are missing in YAGO. Type assertions (TAs) are likewise missing for many instances; for example, about half a million persons in DBpedia are not explicitly typed as such [23]. Missing knowledge in Wikipedia-based knowledge graphs can be attributed to absent information in Wikipedia, but also to the extraction procedures of the knowledge graphs. DBpedia uses infobox mappings to extract RAs for individual instances, but it does not make explicit any information that is implicitly encoded in categories. YAGO uses manually defined patterns to assign RAs to entities of matching categories. For example, they extract a person's year of birth by
Jun-28-2019
- Country:
  - Asia
    - India (0.04)
    - Middle East > Iran
      - East Azerbaijan Province > Tabriz (0.04)
  - Europe > Germany (0.04)
  - North America > United States
    - Arizona (0.04)
    - Michigan (0.04)
    - Missouri (0.04)
    - New York (0.04)
    - Pennsylvania > Chester County (0.04)
- Genre:
  - Research Report (0.82)
- Industry:
  - Leisure & Entertainment (1.00)
  - Media > Music (0.47)
- Technology: