ReasonBERT: Pre-trained to Reason with Distant Supervision

Deng, Xiang; Su, Yu; Lees, Alyssa; Wu, You; Yu, Cong; Sun, Huan

Sep-10-2021–arXiv.org Artificial Intelligence

We present ReasonBERT, a pre-training method that augments language models with the ability to reason over long-range relations and multiple, possibly hybrid, contexts. Unlike existing pre-training methods that only harvest learning signals from local contexts of naturally occurring texts, we propose a generalized notion of distant supervision to automatically connect multiple pieces of text and tables to create pre-training examples that require long-range reasoning. Different types of reasoning are simulated, including intersecting multiple pieces of evidence, bridging from one piece of evidence to another, and detecting unanswerable cases. We conduct a comprehensive evaluation on a variety of extractive question answering datasets that require various reasoning capabilities, ranging from single-hop to multi-hop and from text-only to table-only to hybrid, and show that ReasonBERT achieves remarkable improvement over an array of strong baselines. Few-shot experiments further demonstrate that our pre-training method substantially improves sample efficiency.
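
The idea of connecting separate pieces of text into a single pre-training example can be illustrated with a toy sketch. This is not the authors' pipeline: the `make_example` helper, the `[QUESTION]` placeholder, the plain substring matching of entities, and the sample sentences are all assumptions made for illustration. It masks one entity in a query sentence, pairs the query with other sentences that mention the same entities, and falls back to an unanswerable example when no evidence contains the masked entity.

```python
from dataclasses import dataclass
from typing import List, Optional

MASK = "[QUESTION]"  # hypothetical placeholder for the masked entity


@dataclass
class Example:
    query: str              # query sentence with the answer entity masked
    evidence: List[str]     # sentences linked to the query by a shared entity
    answer: Optional[str]   # None marks an unanswerable example


def make_example(query_sentence: str, shared_entity: str,
                 answer_entity: str, corpus: List[str]) -> Example:
    """Build one toy example (illustrative only, not the paper's procedure)."""
    if shared_entity not in query_sentence or answer_entity not in query_sentence:
        raise ValueError("query sentence must mention both entities")
    query = query_sentence.replace(answer_entity, MASK)
    # Candidate evidence: other sentences that mention the shared entity.
    candidates = [s for s in corpus
                  if s != query_sentence and shared_entity in s]
    # Supporting evidence must also contain the masked answer entity.
    supporting = [s for s in candidates if answer_entity in s]
    if supporting:
        return Example(query, supporting, answer_entity)
    # No evidence contains the answer: keep it as an unanswerable case.
    return Example(query, candidates, None)


corpus = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the city where Marie Curie spent her childhood.",
    "Marie Curie won the Nobel Prize in Physics.",
]
answerable = make_example(corpus[0], "Marie Curie", "Warsaw", corpus)
unanswerable = make_example(corpus[0], "Marie Curie", "Warsaw",
                            [corpus[0], corpus[2]])
print(answerable)
print(unanswerable)
```

A real system would rely on entity linking rather than substring matching and would also draw evidence from tables; the sketch only shows how a masked query and distantly matched evidence fit together.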

Country:
- North America
  - United States
    - Texas (0.04)
    - Ohio > Franklin County
      - Columbus (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Illinois > Cook County
      - Chicago (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Italy > Tuscany
    - Florence (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Singapore (0.04)
  - Middle East > Jordan (0.04)
  - Taiwan > Taiwan Province
    - Taipei (0.04)
- Africa > Ethiopia
  - Addis Ababa > Addis Ababa (0.04)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment (1.00)
- Government (0.93)
- Media > Film (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (0.93)
  - Machine Learning > Inductive Learning (0.69)