DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction
Chua, Freddy C., Duffy, Nigel P.
arXiv.org Artificial Intelligence
We combine deep learning and Conditional Probabilistic Context Free Grammars (CPCFG) to create an end-to-end system for extracting structured information from complex documents. For each class of documents, we create a CPCFG that describes the structure of the information to be extracted. The conditional probabilities are modeled by deep neural networks. We use this grammar to parse 2-D documents and directly produce structured records containing the extracted information. The system is trained end-to-end on (Document, Record) pairs. We apply this approach to extracting information from scanned invoices, achieving state-of-the-art results.
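To make the grammar-plus-scoring idea concrete, here is a minimal sketch of Viterbi CKY parsing over a toy grammar. It is not the authors' implementation. The grammar, the labels (`LineItem`, `Description`, `Amount`), and the functions `terminal_log_prob`, `cky_parse`, and `to_record` are all hypothetical. The sketch parses a 1-D token sequence rather than a 2-D page, uses a hand-written heuristic where the paper uses a neural network, and does no training.

```python
import math

# Hypothetical toy grammar in Chomsky normal form (not the paper's grammar).
# parent -> list of (left child, right child, log-score of the rule)
BINARY_RULES = {
    "LineItem": [("Description", "Amount", 0.0)],
    "Description": [("Description", "Description", 0.0)],
}
TERMINALS = ("Description", "Amount")


def terminal_log_prob(label, token):
    """Stand-in for a neural network that scores a token under a label."""
    is_number = token.replace(".", "", 1).isdigit()
    p_amount = 0.9 if is_number else 0.1
    p = p_amount if label == "Amount" else 1.0 - p_amount
    return math.log(p)


def cky_parse(tokens, root):
    """Return (log-score, tree) of the best parse rooted at `root`, or None."""
    n = len(tokens)
    best = {}  # (start, end, label) -> (log-score, tree) for tokens[start:end]
    for i, tok in enumerate(tokens):
        for label in TERMINALS:
            best[(i, i + 1, label)] = (terminal_log_prob(label, tok), (label, tok))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for parent, rules in BINARY_RULES.items():
                for left, right, rule_lp in rules:
                    for k in range(i + 1, j):  # split point
                        l = best.get((i, k, left))
                        r = best.get((k, j, right))
                        if l is None or r is None:
                            continue
                        score = rule_lp + l[0] + r[0]
                        key = (i, j, parent)
                        if key not in best or score > best[key][0]:
                            best[key] = (score, (parent, l[1], r[1]))
    return best.get((0, n, root))


def to_record(tree, record=None):
    """Collect leaf tokens by label, turning a parse tree into a flat record."""
    record = {} if record is None else record
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        record.setdefault(label, []).append(children[0])
    else:
        for child in children:
            to_record(child, record)
    return record


score, tree = cky_parse(["Blue", "Widget", "12.50"], "LineItem")
record = to_record(tree)
```

In this sketch the parse tree doubles as the extraction result: reading the leaves by label yields the record. That is the sense in which the abstract describes parsing as directly producing structured records.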
Mar-10-2021
- Country:
  - North America > United States
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Massachusetts
      - Suffolk County > Boston (0.04)
      - Middlesex County > Cambridge (0.04)
    - California
      - San Francisco County > San Francisco (0.14)
      - Santa Clara County > Palo Alto (0.04)
      - San Diego County > San Diego (0.04)
  - Europe
  - Asia
- Genre:
  - Research Report (0.82)
- Industry:
  - Law (0.46)
- Technology: