A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution

Ding, Bowen, Min, Qingkai, Ma, Shengkun, Li, Yingjie, Yang, Linyi, Zhang, Yue

May-8-2024–arXiv.org Artificial Intelligence

Based on Pre-trained Language Models (PLMs), event coreference resolution (ECR) systems have demonstrated outstanding performance in clustering coreferential events across documents. However, the state-of-the-art system exhibits an excessive reliance on the 'triggers lexical matching' spurious pattern in the input mention pair text. We formalize the decision-making process of the baseline ECR system using a Structural Causal Model (SCM), aiming to identify spurious and causal associations (i.e., rationales) within the ECR task. Leveraging the debiasing capability of counterfactual data augmentation, we develop a rationale-centric counterfactual data augmentation method with LLM-in-the-loop. This method is specialized for pairwise input in the ECR system, where we conduct direct interventions on triggers and context to mitigate the spurious association while emphasizing the causation.

Figure 1: The distribution of 'triggers lexical matching' in mention pairs from the ECB+ training set, along with a false negative example from Held et al.'s system which ...
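The 'triggers lexical matching' pattern mentioned in the abstract can be made concrete with a small counting sketch. This is an illustration only, not the authors' code: the function names, the matching criterion (case-insensitive string equality of the two triggers), and the toy mention pairs are all assumptions introduced here.

```python
from collections import Counter


def triggers_match(trigger_a: str, trigger_b: str) -> bool:
    """Assumed criterion: triggers match if equal after trimming and lowercasing."""
    return trigger_a.strip().lower() == trigger_b.strip().lower()


def matching_distribution(pairs):
    """Count mention pairs by (triggers match?, labeled coreferent?)."""
    counts = Counter()
    for trigger_a, trigger_b, coreferent in pairs:
        key = (triggers_match(trigger_a, trigger_b), bool(coreferent))
        counts[key] += 1
    return counts


# Hypothetical toy pairs: (trigger of mention 1, trigger of mention 2, gold label).
toy_pairs = [
    ("killed", "killed", True),     # matching triggers, coreferent
    ("killed", "Killed", False),    # matching triggers, different events
    ("acquired", "bought", True),   # different triggers, same event
    ("earthquake", "win", False),   # different triggers, different events
]

dist = matching_distribution(toy_pairs)
for (match, coref), n in sorted(dist.items()):
    print(f"triggers_match={match!s:5} coreferent={coref!s:5} count={n}")
```

If most matching pairs in a dataset are coreferent and most non-matching pairs are not, a classifier can score well by relying on the match alone. That shortcut is the kind of spurious association the abstract describes, and the second and third toy pairs are the cases where it fails.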

large language model, machine learning, natural language


Country:
- Asia
  - China
    - Beijing > Beijing (0.04)
    - Hong Kong (0.04)
  - Indonesia > New Guinea
    - Western New Guinea
      - Papua (0.04)
      - West Papua (0.04)
  - Japan > Kyūshū & Okinawa
    - Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
  - Singapore (0.04)
- Europe
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
  - Croatia (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - France (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Sweden > Uppsala County
    - Uppsala (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Germany > Berlin (0.04)
  - Iceland > Capital Region
    - Reykjavik (0.04)
  - Greece > Central Macedonia
    - Thessaloniki (0.04)
- North America
  - Canada
    - Ontario > Toronto (0.04)
    - Quebec > Montreal (0.04)
  - Dominican Republic (0.04)
  - United States
    - California
      - Los Angeles County > Los Angeles (0.04)
      - Orange County > Newport Beach (0.04)
      - Riverside County > Rancho Mirage (0.04)
    - Indiana > Marion County
      - Indianapolis (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Maryland > Howard County
      - Columbia (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Missouri > Jackson County
      - Kansas City (0.14)
    - New York (0.04)
    - Washington > King County
      - Seattle (0.04)
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- South America > Uruguay (0.04)

Genre:
- Personal > Obituary (1.00)
- Research Report (1.00)

Industry:
- Health & Medicine (0.68)
- Information Technology > Security & Privacy (1.00)
- Leisure & Entertainment > Sports
  - Football (1.00)
  - Soccer (0.92)
- Media > Film (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (0.47)
    - Performance Analysis > Accuracy (0.54)
  - Natural Language > Large Language Model (1.00)
