AITopics | kaggledbqa

Collaborating Authors

kaggledbqa

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL

Wang, Dingzirui, Dou, Longxu, Zhang, Xuanliang, Zhu, Qingfu, Che, Wanxiang

arXiv.org Artificial IntelligenceFeb-16-2024

Currently, the in-context learning method based on large language models (LLMs) has become the mainstream of text-to-SQL research. Previous works have discussed how to select demonstrations related to the user question from a human-labeled demonstration pool. However, human labeling suffers from the limitations of insufficient diversity and high labeling overhead. Therefore, in this paper, we discuss how to measure and improve the diversity of the demonstrations for text-to-SQL. We present a metric to measure the diversity of the demonstrations and analyze the insufficient of the existing labeled data by experiments. Based on the above discovery, we propose fusing iteratively for demonstrations (Fused) to build a high-diversity demonstration pool through human-free multiple-iteration synthesis, improving diversity and lowering label cost. Our method achieves an average improvement of 3.2% and 5.0% with and without human labeling on several mainstream datasets, which proves the effectiveness of Fused.

database, demonstration, diversity, (15 more...)

arXiv.org Artificial Intelligence

2402.10663

Country:

Asia > Singapore (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Improving Generalization in Semantic Parsing by Increasing Natural Language Variation

Saparina, Irina, Lapata, Mirella

arXiv.org Artificial IntelligenceFeb-13-2024

Text-to-SQL semantic parsing has made significant progress in recent years, with various models demonstrating impressive performance on the challenging Spider benchmark. However, it has also been shown that these models often struggle to generalize even when faced with small perturbations of previously (accurately) parsed expressions. This is mainly due to the linguistic form of questions in Spider which are overly specific, unnatural, and display limited variation. In this work, we use data augmentation to enhance the robustness of text-to-SQL parsers against natural language variations. Existing approaches generate question reformulations either via models trained on Spider or only introduce local changes. In contrast, we leverage the capabilities of large language models to generate more realistic and diverse questions. Using only a few prompts, we achieve a two-fold increase in the number of questions in Spider. Training on this augmented dataset yields substantial improvements on a range of evaluation sets, including robustness benchmarks and out-of-domain data.

augmentation, computational linguistic, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2402.08666

Country:

North America > United States > Utah (0.04)
North America > United States > Texas (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(9 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Selective Demonstrations for Cross-domain Text-to-SQL

Chang, Shuaichen, Fosler-Lussier, Eric

arXiv.org Artificial IntelligenceOct-10-2023

Large language models (LLMs) with in-context learning have demonstrated impressive generalization capabilities in the cross-domain text-to-SQL task, without the use of in-domain annotations. However, incorporating in-domain demonstration examples has been found to greatly enhance LLMs' performance. In this paper, we delve into the key factors within in-domain examples that contribute to the improvement and explore whether we can harness these benefits without relying on in-domain annotations. Based on our findings, we propose a demonstration selection framework ODIS which utilizes both out-of-domain examples and synthetically generated in-domain examples to construct demonstrations. By retrieving demonstrations from hybrid sources, ODIS leverages the advantages of both, showcasing its effectiveness compared to baseline methods that rely on a single data source. Furthermore, ODIS outperforms state-of-the-art approaches on two cross-domain text-to-SQL datasets, with improvements of 1.1 and 11.8 points in execution accuracy, respectively.

arxiv preprint arxiv, database, demonstration, (13 more...)

arXiv.org Artificial Intelligence

2310.06302

Country:

North America > United States > Alaska (0.04)
North America > United States > Alabama (0.04)
North America > United States > Ohio (0.04)
(7 more...)

Genre:

Research Report > Promising Solution (0.66)
Research Report > New Finding (0.48)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers

Lee, Chia-Hsuan, Polozov, Oleksandr, Richardson, Matthew

arXiv.org Artificial IntelligenceJun-21-2021

The goal of database question answering is to enable natural language querying of real-life relational databases in diverse application domains. Recently, large-scale datasets such as Spider and WikiSQL facilitated novel modeling techniques for text-to-SQL parsing, improving zero-shot generalization to unseen databases. In this work, we examine the challenges that still prevent these techniques from practical deployment. First, we present KaggleDBQA, a new cross-domain evaluation dataset of real Web databases, with domain-specific data types, original formatting, and unrestricted questions. Second, we re-examine the choice of evaluation tasks for text-to-SQL parsers as applied in real-life settings. Finally, we augment our in-domain evaluation task with database documentation, a naturally occurring source of implicit domain knowledge. We show that KaggleDBQA presents a challenge to state-of-the-art zero-shot parsers but a more realistic evaluation setting and creative use of associated database documentation boosts their accuracy by over 13.2%, doubling their performance.

database, dataset, kaggledbqa, (16 more...)

arXiv.org Artificial Intelligence

2106.11455

Country:

North America > United States > Wisconsin (0.04)
Asia > Japan (0.04)
North America > United States > Wyoming (0.04)
(10 more...)

Genre: Research Report (0.50)

Industry: Education (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.51)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback