Retrieval-Augmented Generation of Ontologies from Relational Databases

Nayyeri, Mojtaba; Yogi, Athish A; Fathallah, Nadeen; Thapa, Ratan Bahadur; Tautenhahn, Hans-Michael; Schnurpel, Anton; Staab, Steffen

arXiv.org Artificial Intelligence 

Transforming relational databases into knowledge graphs with enriched ontologies enhances semantic interoperability and unlocks advanced graph-based learning and reasoning over data. However, previous approaches either demand significant manual effort to derive an ontology from a database schema or produce only a basic ontology. We present RIGOR (Retrieval-augmented Iterative Generation of RDB Ontologies), an LLM-driven approach that turns relational schemas into rich OWL ontologies with minimal human effort. RIGOR uses RAG to combine three sources (the database schema and its documentation, a repository of domain ontologies, and a growing core ontology) and prompts a generative LLM to produce successive, provenance-tagged "delta ontology" fragments. Each fragment is refined by a judge-LLM before being merged into the core ontology, and the process iterates table by table, following foreign key constraints, until coverage is complete.
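
The iterative loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`fk_order`, `build_ontology`, `retrieve`, `generate`, `judge`), the toy schema, and the rule of visiting referenced tables before the tables that reference them are all assumptions made for the example. The LLM calls are replaced by plain Python stubs, and retrieval from the domain-ontology repository is left out.

```python
from typing import Callable, Dict, List

Schema = Dict[str, List[str]]        # table name -> column names
ForeignKeys = Dict[str, List[str]]   # table name -> tables it references


def fk_order(schema: Schema, fks: ForeignKeys) -> List[str]:
    """Assumed traversal: referenced tables come before referencing ones."""
    order: List[str] = []
    seen = set()

    def visit(table: str) -> None:
        if table in seen:            # also stops recursion on FK cycles
            return
        seen.add(table)
        for parent in fks.get(table, []):
            if parent in schema:
                visit(parent)
        order.append(table)

    for table in sorted(schema):     # sorted for a deterministic order
        visit(table)
    return order


def build_ontology(schema: Schema, fks: ForeignKeys,
                   retrieve: Callable, generate: Callable,
                   judge: Callable) -> List[dict]:
    """Grow a core ontology one table at a time."""
    core: List[dict] = []
    for table in fk_order(schema, fks):
        context = retrieve(table, schema, core)   # RAG step (stubbed)
        delta = generate(table, context)          # generator LLM (stubbed)
        refined = judge(table, delta)             # judge LLM (stubbed)
        if refined:
            # Provenance tag: remember which table produced the fragment.
            core.append({"source_table": table, "axioms": refined})
    return core


# --- Hypothetical stand-ins for the retrieval and LLM components ---
def retrieve(table, schema, core):
    return {"columns": schema[table],
            "known_tables": [frag["source_table"] for frag in core]}


def generate(table, context):
    axioms = [f"Class: {table.capitalize()}"]
    axioms += [f"DataProperty: {table}_{col}" for col in context["columns"]]
    return axioms


def judge(table, delta):
    return [axiom for axiom in delta if axiom]    # accepts everything


schema = {"orders": ["id", "customer_id"],
          "customers": ["id"],
          "items": ["id", "order_id"]}
fks = {"orders": ["customers"], "items": ["orders"]}

core = build_ontology(schema, fks, retrieve, generate, judge)
for fragment in core:
    print(fragment["source_table"], fragment["axioms"])
```

In a real pipeline the three stubs would wrap a retriever and two LLM calls, and the merge step would write into an OWL store rather than append to a list; the sketch only shows the control flow of generate, judge, and merge per table.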