Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking over Larger Language Models
arXiv.org Artificial Intelligence
In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. Our real-world experimental benchmarks demonstrate that fine-tuning Transformer models specifically for fact-checking tasks, such as claim detection and veracity prediction, provides superior performance over large language models (LLMs) like GPT-4, GPT-3.5-Turbo, and Mistral-7b. However, we illustrate that LLMs excel in generative tasks such as question decomposition for evidence retrieval. Through extensive evaluation, we show the efficacy of fine-tuned models for fact-checking in a multilingual setting and on complex claims that include numerical quantities.
Apr-30-2024
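The claim-detection task the abstract describes is a supervised classification problem: given a sentence, decide whether it is a check-worthy claim. The paper fine-tunes multilingual Transformer encoders for this; as a minimal, hedged sketch of the same setup, the toy stand-in below trains a bag-of-words logistic-regression classifier on a handful of hypothetical labeled examples (all data, names, and thresholds here are illustrative, not the authors' pipeline or results).

```python
# Illustrative sketch only: the paper fine-tunes multilingual Transformer
# encoders for claim detection; this tiny bag-of-words logistic regression
# stands in for the same binary setup (check-worthy claim vs. not).
import math
from collections import defaultdict

# Hypothetical toy data; the real pipeline uses labeled multilingual claims.
TRAIN = [
    ("the unemployment rate rose by 3 percent last year", 1),
    ("inflation hit 9.1 percent in june", 1),
    ("vaccines reduced hospitalizations by half", 1),
    ("good morning everyone", 0),
    ("i really enjoyed the concert", 0),
    ("what a lovely day it is", 0),
]

def featurize(text):
    """Lowercased token counts as sparse features."""
    counts = defaultdict(float)
    for tok in text.lower().split():
        counts[tok] += 1.0
    return counts

class ClaimDetector:
    """Logistic regression over bag-of-words features, trained by SGD."""

    def __init__(self, lr=0.5, epochs=200):
        self.w = defaultdict(float)  # per-token weights
        self.b = 0.0                 # bias term
        self.lr, self.epochs = lr, epochs

    def _prob(self, feats):
        z = self.b + sum(self.w[t] * c for t, c in feats.items())
        return 1.0 / (1.0 + math.exp(-z))  # sigmoid

    def fit(self, data):
        for _ in range(self.epochs):
            for text, y in data:
                feats = featurize(text)
                err = y - self._prob(feats)  # gradient of the log-loss
                self.b += self.lr * err
                for t, c in feats.items():
                    self.w[t] += self.lr * err * c

    def predict(self, text):
        return int(self._prob(featurize(text)) >= 0.5)

detector = ClaimDetector()
detector.fit(TRAIN)
print(detector.predict("gdp grew by 2 percent"))
```

A fine-tuned Transformer replaces the bag-of-words features and linear layer with contextual encodings, which is what lets it generalize across the 90+ languages and the numerical claims the paper evaluates.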