FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs
Wan, Yingjia; Tan, Haochen; Zhu, Xiao; Zhou, Xinyu; Li, Zhiwei; Lv, Qingsong; Sun, Changxuan; Zeng, Jiaqi; Xu, Yi; Lu, Jianqiao; Liu, Yinhong; Guo, Zhijiang
arXiv.org Artificial Intelligence
Evaluating the factuality of long-form generations from Large Language Models (LLMs) remains challenging due to efficiency bottlenecks and reliability concerns. Prior efforts approach the task by decomposing text into claims, searching for evidence, and verifying each claim, but they suffer from two critical drawbacks: (1) inefficiency due to overcomplicated pipeline components, and (2) ineffectiveness stemming from inaccurate claim sets and insufficient evidence. To address these limitations, we propose FaStfact, an evaluation framework that achieves the highest alignment with human evaluation and the best time and token efficiency among existing baselines. FaStfact first employs chunk-level claim extraction integrated with confidence-based pre-verification, significantly reducing time and token cost while ensuring reliability. For searching and verification, it collects document-level evidence from crawled web pages and selectively retrieves it during verification. Extensive experiments on an annotated benchmark, FaStfact-Bench, demonstrate the reliability of FaStfact in evaluating long-form factuality both efficiently and effectively. Code, benchmark data, and the annotation interface tool are available at https://github.com/Yingjia-Wan/FaStfact.
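The control flow described in the abstract (extract claims per chunk, settle high-confidence claims early, send the rest to evidence-based verification) can be illustrated with a minimal Python sketch. This is not the released FaStfact code: the names `Claim`, `split_into_chunks`, `evaluate`, the period-based sentence splitter, and the `threshold` value are all hypothetical, and the abstract does not say how confidence is computed or how evidence is retrieved, so both are passed in as caller-supplied functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Claim:
    """Hypothetical container for one extracted claim."""
    text: str
    confidence: float   # assumed: extractor's confidence in its own verdict, in [0, 1]
    pre_verdict: bool   # assumed: extractor's tentative "supported" verdict


def split_into_chunks(text: str, sentences_per_chunk: int = 3) -> List[str]:
    """Naive chunker: split on periods and group sentences (an assumption)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [
        ". ".join(sentences[i:i + sentences_per_chunk]) + "."
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def evaluate(
    response: str,
    extract: Callable[[str], List[Claim]],
    verify_with_evidence: Callable[[str], bool],
    threshold: float = 0.9,          # hypothetical cutoff, not from the paper
    sentences_per_chunk: int = 3,
) -> List[Tuple[str, bool]]:
    """Return (claim text, verdict) pairs for every claim in the response."""
    results: List[Tuple[str, bool]] = []
    for chunk in split_into_chunks(response, sentences_per_chunk):
        for claim in extract(chunk):
            if claim.confidence >= threshold:
                # Pre-verification: accept the extractor's verdict, skip search.
                results.append((claim.text, claim.pre_verdict))
            else:
                # Low confidence: fall back to evidence-based verification.
                results.append((claim.text, verify_with_evidence(claim.text)))
    return results
```

In this sketch the savings come from the early exit: only claims below the threshold trigger the (presumably expensive) search-and-verify step, which is one plausible reading of how pre-verification reduces time and token cost.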
Nov-6-2025