Reasoning with RAGged events: RAG-Enhanced Event Knowledge Base Construction and reasoning with proof-assistants

Chatzikyriakidis, Stergios

arXiv.org Artificial Intelligence 

Extracting structured representations of historical events from narrative sources remains challenging when done manually. While RDF/OWL reasoners support graph-based reasoning, their expressiveness is limited to restricted fragments of first-order logic. We develop automated models for historical event extraction using large language models (GPT-4, Claude, Llama 3.2) with three strategies: direct generation, knowledge-graph augmentation, and retrieval-augmented generation (RAG). Using the first 10 chapters of Thucydides' work as a case study, we find that different enhancement strategies optimize different performance dimensions rather than providing across-the-board improvements. Direct generation favors coverage, while RAG improves precision but reduces breadth. Model architecture influences this trade-off: large models show stable baselines with incremental RAG benefits, while Llama 3.2 exhibits extreme variance, ranging from competitive to catastrophic performance. To address RDF's expressivity limitations, we develop a translation pipeline converting RDF outputs to Coq proof assistant specifications, enabling temporal arithmetic with BCE dates, multi-step causal inference, and formal validation of domain-specific event types. These results demonstrate that optimal enhancement strategies depend on specific application requirements, and they establish foundations for computational humanities that combine NLP scalability with formal verification.
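
To illustrate why temporal arithmetic with BCE dates needs care, the following is a minimal sketch in Python. It is not the Coq specification described above: the names `HistoricalYear`, `years_between`, and `precedes` are hypothetical, and the approach shown (mapping years to astronomical numbering, where 1 BCE becomes 0) is one common convention assumed here for illustration. The point is that BCE years count downward and the calendar has no year 0, so naive subtraction of the written year numbers gives wrong signs or off-by-one errors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoricalYear:
    """A calendar year as written in sources (hypothetical helper type)."""
    year: int  # positive number as written, e.g. 431
    era: str   # "BCE" or "CE"

    def __post_init__(self):
        # The BCE/CE calendar has no year 0, so written years start at 1.
        if self.year < 1 or self.era not in ("BCE", "CE"):
            raise ValueError(f"invalid year: {self.year} {self.era}")

    def astronomical(self) -> int:
        """Map to astronomical numbering: 1 BCE -> 0, 2 BCE -> -1, 1 CE -> 1."""
        return self.year if self.era == "CE" else 1 - self.year


def years_between(start: HistoricalYear, end: HistoricalYear) -> int:
    """Signed number of years from start to end (positive if end is later)."""
    return end.astronomical() - start.astronomical()


def precedes(a: HistoricalYear, b: HistoricalYear) -> bool:
    """True if year a is strictly earlier than year b."""
    return a.astronomical() < b.astronomical()
```

For example, `years_between(HistoricalYear(431, "BCE"), HistoricalYear(404, "BCE"))` is 27, and the span from 1 BCE to 1 CE is a single year rather than two. A proof-assistant encoding could state the same ordering and difference properties as lemmas over integers, which is the kind of guarantee a plain RDF date literal does not provide.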