MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation

Gorti, Satya Krishna, Gofman, Ilan, Liu, Zhaoyan, Wu, Jiapeng, Vouitsis, Noël, Yu, Guangwei, Cresswell, Jesse C., Hosseinzadeh, Rasa

Oct-16-2024–arXiv.org Artificial Intelligence

Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models like GPT-4 that present challenges in accessibility, privacy, and latency. To address these issues, we focus on developing small, efficient, and open-source text-to-SQL models. We demonstrate the benefits of sampling multiple candidate SQL generations and propose our method, MSc-SQL, to critique them using associated metadata. Our sample critiquing model evaluates multiple outputs simultaneously, achieving state-of-the-art performance compared to other open-source models while remaining competitive with larger models at a much lower cost. Full code can be found at github.com/layer6ai-labs/msc-sql.
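
The sample-then-critique idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that). It assumes that the "associated metadata" includes execution results, which the abstract does not specify, and it replaces the trained critiquing model with a hypothetical rule-based stand-in, `toy_critic`. All function names, the table schema, and the candidate queries below are invented for illustration.

```python
import sqlite3

def run_candidate(conn, sql):
    """Execute one candidate query and record metadata about the attempt."""
    try:
        rows = conn.execute(sql).fetchall()
        return {"sql": sql, "ok": True, "rows": rows, "error": None}
    except sqlite3.Error as exc:
        return {"sql": sql, "ok": False, "rows": [], "error": str(exc)}

def toy_critic(question, candidates):
    """Hypothetical stand-in for a learned critic.

    Sees all candidates at once and returns the index of the preferred one:
    executable beats failing, non-empty beats empty, and ties are broken by
    how many candidates return the same rows. The question is unused here;
    a learned critic would condition on it.
    """
    def score(c):
        if not c["ok"]:
            return (0, 0)
        agreement = sum(1 for o in candidates if o["ok"] and o["rows"] == c["rows"])
        return (1 + bool(c["rows"]), agreement)
    return max(range(len(candidates)), key=lambda i: score(candidates[i]))

def select_sql(conn, question, candidate_sqls, critic=toy_critic):
    """Run every sampled candidate, then let the critic choose among them."""
    metadata = [run_candidate(conn, sql) for sql in candidate_sqls]
    return metadata[critic(question, metadata)]["sql"]

# Illustrative database and hand-written "samples" (a real system would
# draw these from a text-to-SQL model).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ada", "CA"), (2, "Bo", "US"), (3, "Cy", "CA")],
)
candidates = [
    "SELECT COUNT(*) FROM user WHERE country = 'CA'",    # wrong table name, fails
    "SELECT COUNT(*) FROM users WHERE country = 'CA'",
    "SELECT COUNT(id) FROM users WHERE country = 'CA'",
]
best = select_sql(conn, "How many users are in Canada?", candidates)
print(best)
```

The point of the sketch is the structure: the critic compares all candidates jointly rather than scoring each one in isolation, which is the property the abstract attributes to the sample critiquing model. In practice, candidate queries should be executed against a read-only copy of the database.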

large language model, machine learning, natural language

Country:
- North America
  - Canada > Ontario
    - Toronto (0.15)
  - United States > California (0.94)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Education (0.94)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language
    - Chatbot (1.00)
    - Large Language Model (1.00)