Yesterday's News: Benchmarking Multi-Dimensional Out-of-Distribution Generalisation of Misinformation Detection Models

Verhoeven, Ivo, Mishra, Pushkar, Shutova, Ekaterina

Oct-12-2024–arXiv.org Artificial Intelligence

This paper introduces misinfo-general, a benchmark dataset for evaluating misinformation models' ability to perform out-of-distribution generalisation. Misinformation changes rapidly, much more quickly than moderators can annotate at scale, resulting in a shift between the training and inference data distributions. As a result, misinformation models need to be able to perform out-of-distribution generalisation, an understudied problem in existing datasets. We identify 6 axes of generalisation (time, event, topic, publisher, political bias, and misinformation type) and design evaluation procedures for each. We also analyse some baseline models, highlighting how these fail to meet important desiderata.
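To make the idea of evaluating along a single generalisation axis concrete, the following is a minimal sketch of a held-out split. It is not the paper's evaluation procedure, which the abstract does not describe in detail. The function name `ood_split`, the field names (`publisher`, `year`, `label`), and the toy records are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Article = Dict[str, str]

# Toy records, invented for illustration only (not taken from misinfo-general).
TOY_ARTICLES: List[Article] = [
    {"publisher": "site-a", "year": "2019", "label": "reliable"},
    {"publisher": "site-b", "year": "2020", "label": "unreliable"},
    {"publisher": "site-c", "year": "2021", "label": "unreliable"},
    {"publisher": "site-a", "year": "2022", "label": "reliable"},
]


def ood_split(
    articles: List[Article], axis: str, held_out: Set[str]
) -> Tuple[List[Article], List[Article]]:
    """Split articles so every value in `held_out` for `axis` appears only in the test set.

    Assumption: each article is a dict that contains the key `axis`.
    """
    train: List[Article] = []
    test: List[Article] = []
    for article in articles:
        # Articles whose axis value is held out are unseen during training.
        if article[axis] in held_out:
            test.append(article)
        else:
            train.append(article)
    return train, test


# Publisher axis: the model never sees "site-c" during training.
pub_train, pub_test = ood_split(TOY_ARTICLES, "publisher", {"site-c"})

# Time axis: train on earlier years, test on the most recent one.
time_train, time_test = ood_split(TOY_ARTICLES, "year", {"2022"})
```

The same function covers any axis that can be stored as a categorical field, so comparing a model's test score across axes gives a rough picture of where generalisation breaks down. A real benchmark would also need to control for label balance and for overlap between axes, which this sketch ignores.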

large language model, machine learning, natural language

Country:
- North America
  - Dominican Republic (0.04)
  - United States
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Georgia > Fulton County
      - Atlanta (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Austria > Vienna (0.14)
  - Switzerland (0.04)
  - Portugal (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - United Kingdom > England
    - Greater London > London (0.04)
  - Norway > Eastern Norway
    - Oslo (0.04)
  - Germany > Baden-Württemberg
    - Stuttgart Region > Stuttgart (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Singapore (0.04)
  - Middle East > UAE (0.04)

Genre:
- Research Report
  - New Finding (0.68)
  - Experimental Study (0.46)

Industry:
- Media > News (1.00)
- Health & Medicine > Therapeutic Area
  - Infections and Infectious Diseases (0.68)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Information Management (0.93)
  - Artificial Intelligence
    - Machine Learning > Statistical Learning (0.67)
    - Natural Language
      - Large Language Model (0.47)
      - Text Processing (0.46)
