Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking
Mubashara Akhtar, Michael Schlichtkrull, Andreas Vlachos
arXiv.org Artificial Intelligence
Current automated fact-checking (AFC) approaches commonly evaluate evidence either implicitly via the predicted verdicts or by comparing retrieved evidence with a predefined closed knowledge source, such as Wikipedia. However, these methods are limited both by their reliance on evaluation metrics developed for other purposes and by the constraints that closed knowledge sources impose. Recent advances in natural language generation (NLG) evaluation offer new possibilities for evidence assessment. In this work, we introduce Ev2R, an evaluation framework for AFC that comprises three types of approaches for evidence evaluation: reference-based, proxy-reference, and reference-less. We evaluate their effectiveness through agreement with human ratings and adversarial tests, and demonstrate that prompt-based scorers, particularly those leveraging LLMs and reference evidence, outperform traditional evaluation approaches.
Nov-8-2024
- Country:
- North America
- Dominican Republic (0.04)
- United States
- Missouri (0.04)
- New York (0.04)
- Nebraska (0.04)
- Michigan > Washtenaw County
- Ann Arbor (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New Jersey > Essex County
- Newark (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- California > Santa Clara County
- Stanford (0.04)
- Georgia > Fulton County
- Atlanta (0.04)
- Alaska > Anchorage Municipality
- Anchorage (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- Europe
- Liechtenstein (0.04)
- Austria (0.04)
- Ireland (0.04)
- Iceland > Capital Region
- Reykjavik (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Sweden > Östergötland County
- Linköping (0.04)
- United Kingdom
- Scotland (0.05)
- Wales (0.04)
- Northern Ireland (0.04)
- England > Cambridgeshire
- Cambridge (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Atlantic Ocean > South Atlantic Ocean
- Gulf of Guinea > Niger Delta (0.04)
- Asia
- China > Hong Kong (0.04)
- Singapore (0.04)
- Philippines (0.04)
- British Indian Ocean Territory > Diego Garcia (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.04)
- India
- NCT > New Delhi (0.04)
- Maharashtra > Mumbai (0.04)
- Africa
- South Africa (0.04)
- Nigeria > Niger Delta (0.04)
- Middle East > Egypt (0.04)
- Ethiopia > Addis Ababa
- Addis Ababa (0.04)
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Government (1.00)
- Health & Medicine > Therapeutic Area
- Infections and Infectious Diseases (0.46)
- Immunology (0.46)
- Technology: