Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System
Haorui He, Yupeng Li, Bin Benjamin Zhu, Dacheng Wen, Reynold Cheng, Francis C. M. Lau
State-of-the-art (SOTA) fact-checking systems combat misinformation by employing autonomous LLM-based agents that decompose complex claims into smaller sub-claims, verify each sub-claim individually, and aggregate the partial results into verdicts with justifications (explanations for the verdicts). The security of these systems is crucial, since a compromised fact-checker can amplify misinformation, yet it remains largely underexplored. To bridge this gap, this work introduces a novel threat model against such fact-checking systems and presents Fact2Fiction, the first poisoning attack framework targeting SOTA agentic fact-checking systems. Fact2Fiction employs LLMs to mimic the system's decomposition strategy and exploits system-generated justifications to craft tailored malicious evidence that compromises sub-claim verification. Extensive experiments demonstrate that Fact2Fiction achieves 8.9%–21.2% higher attack success rates than SOTA attacks across various poisoning budgets, exposing security weaknesses in existing fact-checking systems and highlighting the need for defensive countermeasures.
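To make the decompose-verify-aggregate pipeline described above concrete, here is a minimal Python sketch with stubbed agents. It is illustrative only, not the paper's implementation: the names `decompose_claim`, `verify_subclaim`, and `fact_check` are hypothetical, the "LLM" steps are replaced by trivial string matching, and the poisoning surface is just a list of corpus documents an attacker could append to.

```python
# Illustrative sketch of an agentic fact-checking pipeline of the kind the
# abstract describes. All names and logic are hypothetical stand-ins; the
# real systems use LLM agents and retrieval, not substring matching.
from dataclasses import dataclass


@dataclass
class SubClaimResult:
    subclaim: str
    verdict: bool       # True = supported, False = refuted
    justification: str  # explanation the system emits for this verdict


def decompose_claim(claim: str) -> list[str]:
    # Stand-in for an LLM agent that splits a complex claim into sub-claims.
    return [part.strip() for part in claim.split(" and ")]


def verify_subclaim(subclaim: str, corpus: list[str]) -> SubClaimResult:
    # Stand-in for retrieval + LLM verification against an evidence corpus.
    # A poisoning attack succeeds by planting "evidence" this step retrieves.
    supporting = [doc for doc in corpus if subclaim.lower() in doc.lower()]
    verdict = bool(supporting)
    justification = supporting[0] if supporting else "No supporting evidence found."
    return SubClaimResult(subclaim, verdict, justification)


def fact_check(claim: str, corpus: list[str]) -> tuple[bool, list[SubClaimResult]]:
    results = [verify_subclaim(sc, corpus) for sc in decompose_claim(claim)]
    # Aggregate: the overall claim holds only if every sub-claim is supported.
    return all(r.verdict for r in results), results


corpus = ["The model was trained on public data."]  # benign evidence
# An attacker with a poisoning budget would append tailored malicious
# documents here, crafted per sub-claim as Fact2Fiction's threat model posits.
verdict, trace = fact_check(
    "The model was trained on public data and passed a safety audit", corpus
)
print(verdict)  # False: the second sub-claim lacks supporting evidence
for r in trace:
    print(r.subclaim, "->", r.verdict, "|", r.justification)
```

Because each sub-claim is verified independently, an attacker who can mimic the decomposition only needs to poison the evidence relevant to individual sub-claims, and the emitted justifications reveal exactly which evidence swayed each verdict.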
arXiv.org Artificial Intelligence
Nov-18-2025
- Country:
  - Asia > China > Hong Kong (0.05)
  - Oceania > New Zealand (0.14)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Government (1.00)
  - Information Technology > Security & Privacy (0.68)
  - Media > News (0.87)