text-to-sql translation
Auto prompt sql: a resource-efficient architecture for text-to-sql translation in constrained environments
Tang, Zetong, Ma, Qian, Wu, Di
Using the best Text-to-SQL methods in resource-constrained environments is challenging due to their reliance on resource-intensive open-source models. This paper introduces Auto Prompt SQL(AP-SQL), a novel architecture designed to bridge the gap between resource-efficient small open-source models and the powerful capabilities of large closed-source models for Text-to-SQL translation. Our method decomposes the task into schema filtering, retrieval-augmented text-to-SQL generation based on in-context examples, and prompt-driven schema linking and SQL generation. To improve schema selection accuracy, we fine-tune large language models. Crucially, we also explore the impact of prompt engineering throughout the process, leveraging Chain-of-Thought(CoT) and Graph-of-Thought(GoT) templates to significantly enhance the model's reasoning for accurate SQL generation. Comprehensive evaluations on the Spider benchmarks demonstrate the effectiveness of AP-SQL.
SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation
Bazaga, Adrián, Liò, Pietro, Micklem, Gos
In recent years, there has been growing interest in text-to-SQL translation, which is the task of converting natural language questions into executable SQL queries. This technology is important for its potential to democratize data extraction from databases. However, some of its key hurdles include domain generalisation, which is the ability to adapt to previously unseen databases, and alignment of natural language questions with the corresponding SQL queries. To overcome these challenges, we introduce SQLformer, a novel Transformer architecture specifically crafted to perform text-to-SQL translation tasks. Our model predicts SQL queries as abstract syntax trees (ASTs) in an autoregressive way, incorporating structural inductive bias in the encoder and decoder layers. This bias, guided by database table and column selection, aids the decoder in generating SQL query ASTs represented as graphs in a Breadth-First Search canonical order. Comprehensive experiments illustrate the state-of-the-art performance of SQLformer in the challenging text-to-SQL Spider benchmark. Our implementation is available at https://github.com/AdrianBZG/SQLformer