MT-RAIG: Novel Benchmark and Evaluation Framework for Retrieval-Augmented Insight Generation over Multiple Tables

Seo, Kwangwook, Kwon, Donguk, Lee, Dongha

Feb-18-2025–arXiv.org Artificial Intelligence

Recent advancements in table-based reasoning have expanded beyond factoid-level QA to address insight-level tasks, where systems should synthesize implicit knowledge in the table to provide explainable analyses. Although effective, existing studies remain confined to scenarios where a single gold table is given alongside the user query, failing to address cases where users seek comprehensive insights from multiple unknown tables. To bridge this gap, we propose MT-RAIG Bench, designed to evaluate systems on Retrieval-Augmented Insight Generation over Multiple Tables. Additionally, to tackle the suboptimality of existing automatic evaluation methods in the table domain, we further introduce a fine-grained evaluation framework, MT-RAIG Eval, which achieves better alignment with human quality judgments on the generated insights. We conduct extensive experiments and reveal that even frontier LLMs still struggle with complex multi-table reasoning, establishing MT-RAIG Bench as a challenging testbed for future research.

large language model, machine learning, natural language

Country:
- Europe (1.00)
- Asia (1.00)
- North America > United States (0.93)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Automobiles & Trucks > Manufacturer (1.00)
- Media > Television (0.92)
- Leisure & Entertainment > Sports
  - Soccer (0.94)

Technology:
- Information Technology
  - Information Management (0.92)
  - Artificial Intelligence
    - Natural Language
      - Large Language Model (1.00)
      - Chatbot (0.96)
    - Machine Learning > Neural Networks
      - Deep Learning (1.00)
