Auto-ARGUE: LLM-Based Report Generation Evaluation
Walden, William, Mason, Marc, Weller, Orion, Dietz, Laura, Conroy, John, Molino, Neil, Recknor, Hannah, Li, Bryan, Liu, Gabrielle Kaili-May, Hou, Yu, Lawrie, Dawn, Mayfield, James, Yang, Eugene
arXiv.org Artificial Intelligence
Generation of long-form, citation-backed reports is a primary use case for retrieval-augmented generation (RAG) systems. While open-source evaluation tools exist for various RAG tasks, ones tailored to report generation (RG) are lacking. Accordingly, we introduce Auto-ARGUE, a robust LLM-based implementation of the recently proposed ARGUE framework for RG evaluation.
Oct-20-2025
- Country:
  - North America
    - Mexico > Mexico City
      - Mexico City (0.04)
    - United States
      - Maryland (0.04)
      - New Hampshire (0.04)
      - Pennsylvania (0.04)
- Genre:
  - Research Report (0.40)
- Technology: