Iterative Document-level Information Extraction via Imitation Learning

Yunmo Chen, William Gantt, Weiwei Gu, Tongfei Chen, Aaron Steven White, Benjamin Van Durme

May 1, 2023, arXiv.org Artificial Intelligence

We present IterX, a novel iterative extraction model for extracting complex relations, or templates (i.e., N-tuples representing a mapping from named slots to spans of text), within a document. A document may contain zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template's slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP) and removes the need for predefined template orders when training an extractor. It achieves state-of-the-art results on two established benchmarks (4-ary relation extraction on SciREX and template extraction on MUC-4) and provides a strong baseline on the new BETTER Granular task.
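To make the task framing concrete, the following is a minimal Python sketch of a template as a mapping from named slots to text spans, together with an iterative loop that emits one template per step until it decides to stop. It is an illustration under stated assumptions, not the IterX model: the names `Span`, `Template`, `extract_iteratively`, and `toy_policy` are hypothetical, the state is assumed to be the list of templates extracted so far, and the rule-based policy merely stands in for a learned one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Span:
    """A span of text, identified by token offsets (end is exclusive)."""
    start: int
    end: int
    text: str


@dataclass(frozen=True)
class Template:
    """An N-tuple of (slot name, span) pairs for one template instance."""
    template_type: str
    slots: Tuple[Tuple[str, Span], ...]


# Assumed policy interface: given the document and the templates extracted
# so far (the state), return the next template (the action) or None to stop.
Policy = Callable[[List[str], List[Template]], Optional[Template]]


def extract_iteratively(tokens: List[str], policy: Policy,
                        max_steps: int = 10) -> List[Template]:
    """Run the policy step by step until it stops or max_steps is reached."""
    extracted: List[Template] = []
    for _ in range(max_steps):
        next_template = policy(tokens, extracted)
        if next_template is None:  # the policy chose the stop action
            break
        extracted.append(next_template)
    return extracted


def toy_policy(tokens: List[str],
               extracted: List[Template]) -> Optional[Template]:
    """Hypothetical rule-based stand-in for a learned policy.

    Emits one 'Attack' template for each occurrence of the token 'bombing'
    that is not already covered by a previously extracted template.
    """
    covered = {span.start
               for template in extracted
               for name, span in template.slots
               if name == "mention"}
    for i, token in enumerate(tokens):
        if token == "bombing" and i not in covered:
            return Template("Attack", (("mention", Span(i, i + 1, token)),))
    return None


tokens = "a bombing in Lima and a bombing in Cusco".split()
templates = extract_iteratively(tokens, toy_policy)
```

Because each step conditions on what has already been extracted, the loop handles documents with zero, one, or several template instances without fixing their number in advance.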

machine learning, natural language, template


Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Information Extraction (0.82)
  - Machine Learning > Neural Networks (0.68)
