Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL
Liu, Geling, Tan, Yunzhi, Zhong, Ruichao, Xie, Yuanzhen, Zhao, Lingchen, Wang, Qian, Hu, Bo, Li, Zang
arXiv.org Artificial Intelligence
Recently, large language models (LLMs) have significantly improved the performance of text-to-SQL systems. Nevertheless, many state-of-the-art (SOTA) approaches have overlooked the critical aspect of system robustness. Our experiments reveal that while LLM-driven methods excel on standard datasets, their accuracy is notably compromised when faced with adversarial perturbations. To address this challenge, we propose a robust text-to-SQL solution, called Solid-SQL, designed to integrate with various LLMs. We focus on the pre-processing stage, training a robust schema-linking model enhanced by LLM-based data augmentation. Additionally, we design a two-round, structural similarity-based example retrieval strategy for in-context learning. Our method achieves SOTA SQL execution accuracy levels of 82.1% and 58.9% on the general Spider and Bird benchmarks, respectively. Furthermore, experimental results show that Solid-SQL delivers an average improvement of 11.6% compared to baselines on the perturbed Spider-Syn, Spider-Realistic, and Dr. Spider benchmarks.
Dec-16-2024
- Country:
- Asia
- China > Hubei Province
- Wuhan (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology: