SQUiD: Synthesizing Relational Databases from Unstructured Text
Sadia, Mushtari; Yang, Zhenning; Xiao, Yunming; Chen, Ang; Chowdhury, Amrita Roy
arXiv.org Artificial Intelligence
Relational databases are central to modern data management, yet most data exists in unstructured forms like text documents. To bridge this gap, we leverage large language models (LLMs) to automatically synthesize a relational database by generating its schema and populating its tables from raw text. We introduce SQUiD, a novel neurosymbolic framework that decomposes this task into four stages, each with specialized techniques. Our experiments show that SQUiD consistently outperforms baselines across diverse datasets.
May-27-2025
- Country:
  - Asia
  - North America > United States
    - California > Los Angeles County
      - Los Angeles (0.04)
    - Maryland > Baltimore (0.04)
    - Massachusetts > Middlesex County
      - Cambridge (0.04)
    - Michigan (0.04)
    - New Mexico > Bernalillo County
      - Albuquerque (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Technology: