Parser Extraction of Triples in Unstructured Text

Nov-6-2018–arXiv.org Artificial Intelligence

The web contains vast repositories of unstructured text. We investigate the opportunity for building a knowledge graph from these text sources. We generate a set of triples that can be used in knowledge gathering and integration. We define the architecture of a language compiler for processing subject-predicate-object triples using the OpenNLP parser. We implement a depth-first search traversal of the POS-tagged syntactic tree, appending predicate and object information. Using a parser enables higher-precision and higher-recall extraction of syntactic relationships across conjunction boundaries. We extract 2 to 2.5 times as many correct extractions as ReVerb. The extractions are used in a variety of semantic web applications and in question answering. We verify the extraction of 50,000 triples on the ClueWeb dataset.
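
As a rough illustration of the depth-first traversal the abstract describes, the Python sketch below walks a POS-tagged constituency tree and collects subject-predicate-object triples, splitting coordinated phrases so that each conjunct yields its own triple. It is not the authors' implementation and does not call OpenNLP: the bracketed parse string is hand-written in Penn Treebank style, and the names `Node`, `parse_tree`, `leaves`, `split_conjuncts`, and `extract_triples` are hypothetical helpers introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a constituency tree (hypothetical structure for this sketch)."""
    label: str                        # phrase label or POS tag, e.g. "NP", "VBZ"
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None        # set only on leaves (POS-tagged tokens)


def parse_tree(text: str) -> Node:
    """Parse a Penn Treebank style bracketed string into Node objects."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Node:
        nonlocal pos
        pos += 1                      # consume "("
        node = Node(tokens[pos])
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read())
            else:
                node.word = tokens[pos]
                pos += 1
        pos += 1                      # consume ")"
        return node

    return read()


def leaves(node: Node) -> List[str]:
    """Words under a node, collected depth-first, left to right."""
    if node.word is not None:
        return [node.word]
    words: List[str] = []
    for child in node.children:
        words.extend(leaves(child))
    return words


def split_conjuncts(node: Node) -> List[Node]:
    """Return the conjuncts of (X (X ..) (CC ..) (X ..)), else the node itself."""
    if any(c.label == "CC" for c in node.children):
        same = [c for c in node.children if c.label == node.label]
        return same or [node]
    return [node]


def extract_triples(node: Node, subject: Optional[str] = None,
                    triples: Optional[List[Tuple[str, str, str]]] = None):
    """Depth-first search that carries the subject down and appends triples."""
    if triples is None:
        triples = []
    if node.label == "S":
        # Assumption: the first NP directly under S is the subject.
        nps = [c for c in node.children if c.label == "NP"]
        if nps:
            subject = " ".join(leaves(nps[0]))
    if node.label == "VP" and subject is not None:
        # Verb tags (VB, VBZ, VBD, ...) directly under this VP form the predicate.
        verbs = [c.word for c in node.children if c.label.startswith("VB")]
        if verbs:
            for child in node.children:
                if child.label == "NP":
                    for obj in split_conjuncts(child):
                        triples.append(
                            (subject, " ".join(verbs), " ".join(leaves(obj))))
    for child in node.children:
        extract_triples(child, subject, triples)
    return triples


# Hand-written parse of "Alice writes code and reads papers and books".
PARSE = ("(S (NP (NNP Alice)) (VP (VP (VBZ writes) (NP (NN code))) (CC and) "
         "(VP (VBZ reads) (NP (NP (NNS papers)) (CC and) (NP (NNS books))))))")

for triple in extract_triples(parse_tree(PARSE)):
    print(triple)
```

Because the subject is passed down the recursion, both coordinated verb phrases inherit "Alice", which is the kind of extraction across a conjunction boundary that a purely pattern-based extractor can miss. A real pipeline would obtain the tree from a parser and would need to handle auxiliaries, prepositional objects, and nested clauses, all of which this sketch ignores.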

artificial intelligence, extraction, natural language

Country:
- North America > United States
  - Michigan (0.14)
- Asia > India
  - Karnataka (0.14)

Genre:
- Research Report (0.40)

Industry:
- Government (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language
    - Text Processing (1.00)
    - Grammars & Parsing (1.00)
