Smoothing Entailment Graphs with Language Models

McKenna, Nick, Li, Tianyi, Johnson, Mark, Steedman, Mark

Sep-21-2023–arXiv.org Artificial Intelligence

The diversity and Zipfian frequency distribution of natural language predicates in corpora leads to sparsity in Entailment Graphs (EGs) built by Open Relation Extraction (ORE). EGs are computationally efficient and explainable models of natural language inference, but as symbolic models, they fail if a novel premise or hypothesis vertex is missing at test-time. We present theory and methodology for overcoming such sparsity in symbolic models. First, we introduce a theory of optimal smoothing of EGs by constructing transitive chains. We then demonstrate an efficient, open-domain, and unsupervised smoothing method using an off-the-shelf Language Model to find approximations of missing premise predicates. This improves recall by 25.1 and 16.3 percentage points on two difficult directional entailment datasets, while raising average precision and maintaining model explainability. Further, in a QA task we show that EG smoothing is most useful for answering questions with lesser supporting text, where missing premise predicates are more costly. Finally, controlled experiments with WordNet confirm our theory and show that hypothesis smoothing is difficult, but possible in principle.

computational linguistic, p-smoothing, predicate, (13 more...)

arXiv.org Artificial Intelligence

Sep-21-2023

arXiv.org PDF

Add feedback

Country:
- North America
  - Dominican Republic (0.04)
  - United States
    - New Jersey (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Michigan > Washtenaw County
      - Ann Arbor (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > Santa Cruz County
      - Santa Cruz (0.04)
- Europe
  - Italy (0.04)
  - Germany > Berlin (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Middle East
    - Jordan (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)
  - Japan > Honshū
    - Kansai > Osaka Prefecture > Osaka (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (0.50)
  - Machine Learning
    - Statistical Learning (0.48)
    - Performance Analysis (0.35)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found