NarraBench: A Comprehensive Framework for Narrative Benchmarking

Sil Hamilton, Matthew Wilkens, Andrew Piper

Dec-2-2025–arXiv.org Artificial Intelligence

We present NarraBench, a theory-informed taxonomy of narrative-understanding tasks, as well as an associated survey of 78 existing benchmarks in the area. We find significant need for new evaluations covering aspects of narrative understanding that are either overlooked in current work or are poorly aligned with existing metrics. Specifically, we estimate that only 27% of narrative tasks are well captured by existing benchmarks, and we note that some areas -- including narrative events, style, perspective, and revelation -- are nearly absent from current evaluations. We also note the need for increased development of benchmarks capable of assessing constitutively subjective and perspectival aspects of narrative, that is, aspects for which there is generally no single correct answer. Our taxonomy, survey, and methodology are of value to NLP researchers seeking to test LLM narrative understanding.

benchmark, large language model, machine learning

Country:
- Europe (0.93)
- North America
  - United States (1.00)
  - Canada > Quebec (0.28)
- Asia > Middle East
  - UAE (0.46)

Genre:
- Overview (0.93)
- Research Report > New Finding (0.67)

Technology:
- Information Technology
  - Communications (0.93)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Machine Learning (1.00)
    - Natural Language
      - Large Language Model (0.71)
      - Discourse & Dialogue (0.46)
