AITopics | Chang, Shuaichen

Collaborating Authors

Chang, Shuaichen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models

Deng, Naihao, Zhang, Sheng, Zhu, Henghui, Chang, Shuaichen, Zhang, Jiani, Li, Alexander Hanbo, Hang, Chung-Wei, Kobayashi, Hideo, Hu, Yiqun, Ng, Patrick

arXiv.org Artificial IntelligenceJan-24-2025

Recent advances in natural language processing have leveraged instruction tuning to enhance Large Language Models (LLMs) for table-related tasks. However, previous works train different base models with different training data, lacking an apples-to-apples comparison across the result table LLMs. To address this, we fine-tune base models from the Mistral, OLMo, and Phi families on existing public training datasets. Our replication achieves performance on par with or surpassing existing table LLMs, establishing new state-of-the-art performance on Hitab, a table question-answering dataset. More importantly, through systematic out-of-domain evaluation, we decouple the contributions of training data and the base model, providing insight into their individual impacts. In addition, we assess the effects of table-specific instruction tuning on general-purpose benchmarks, revealing trade-offs between specialization and generalization.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.14717

Country:

Asia (0.93)
Europe (0.68)
North America > United States > Texas (0.14)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment > Sports (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries

Dong, Mingwen, Kumar, Nischal Ashok, Hu, Yiqun, Chauhan, Anuj, Hang, Chung-Wei, Chang, Shuaichen, Pan, Lin, Lan, Wuwei, Zhu, Henghui, Jiang, Jiarong, Ng, Patrick, Wang, Zhiguo

arXiv.org Artificial IntelligenceOct-14-2024

Previous text-to-SQL datasets and systems have primarily focused on user questions with clear intentions that can be answered. However, real user questions can often be ambiguous with multiple interpretations or unanswerable due to a lack of relevant data. In this work, we construct a practical conversational text-to-SQL dataset called PRACTIQ, consisting of ambiguous and unanswerable questions inspired by real-world user questions. We first identified four categories of ambiguous questions and four categories of unanswerable questions by studying existing text-to-SQL datasets. Then, we generate conversations with four turns: the initial user question, an assistant response seeking clarification, the user's clarification, and the assistant's clarified SQL response with the natural language explanation of the execution results. For some ambiguous queries, we also directly generate helpful SQL responses, that consider multiple aspects of ambiguity, instead of requesting user clarification. To benchmark the performance on ambiguous, unanswerable, and answerable questions, we implemented large language model (LLM)-based baselines using various LLMs. Our approach involves two steps: question category classification and clarification SQL prediction. Our experiments reveal that state-of-the-art systems struggle to handle ambiguous and unanswerable questions effectively. We will release our code for data generation and experiments on GitHub.

category, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.11076

Country:

Europe (0.67)
Asia (0.46)
North America > United States (0.28)

Genre:

Research Report (0.82)
Personal (0.67)

Industry:

Automobiles & Trucks > Manufacturer (1.00)
Leisure & Entertainment (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings

Chang, Shuaichen, Fosler-Lussier, Eric

arXiv.org Artificial IntelligenceNov-26-2023

Large language models (LLMs) with in-context learning have demonstrated remarkable capability in the text-to-SQL task. Previous research has prompted LLMs with various demonstration-retrieval strategies and intermediate reasoning steps to enhance the performance of LLMs. However, those works often employ varied strategies when constructing the prompt text for text-to-SQL inputs, such as databases and demonstration examples. This leads to a lack of comparability in both the prompt constructions and their primary contributions. Furthermore, selecting an effective prompt construction has emerged as a persistent problem for future research. To address this limitation, we comprehensively investigate the impact of prompt constructions across various settings and provide insights into prompt constructions for future text-to-SQL studies.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.11853

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Motorsports (0.68)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Add feedback

Selective Demonstrations for Cross-domain Text-to-SQL

Chang, Shuaichen, Fosler-Lussier, Eric

arXiv.org Artificial IntelligenceOct-10-2023

Large language models (LLMs) with in-context learning have demonstrated impressive generalization capabilities in the cross-domain text-to-SQL task, without the use of in-domain annotations. However, incorporating in-domain demonstration examples has been found to greatly enhance LLMs' performance. In this paper, we delve into the key factors within in-domain examples that contribute to the improvement and explore whether we can harness these benefits without relying on in-domain annotations. Based on our findings, we propose a demonstration selection framework ODIS which utilizes both out-of-domain examples and synthetically generated in-domain examples to construct demonstrations. By retrieving demonstrations from hybrid sources, ODIS leverages the advantages of both, showcasing its effectiveness compared to baseline methods that rely on a single data source. Furthermore, ODIS outperforms state-of-the-art approaches on two cross-domain text-to-SQL datasets, with improvements of 1.1 and 11.8 points in execution accuracy, respectively.

demonstration, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2310.06302

Country: North America > United States (1.00)

Genre:

Research Report > Promising Solution (0.66)
Research Report > New Finding (0.48)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

Chang, Shuaichen, Wang, Jun, Dong, Mingwen, Pan, Lin, Zhu, Henghui, Li, Alexander Hanbo, Lan, Wuwei, Zhang, Sheng, Jiang, Jiarong, Lilien, Joseph, Ash, Steve, Wang, William Yang, Wang, Zhiguo, Castelli, Vittorio, Ng, Patrick, Xiang, Bing

arXiv.org Artificial IntelligenceJan-28-2023

Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous curated robustness test sets usually focus on individual phenomena. In this paper, we propose a comprehensive robustness benchmark based on Spider, a cross-domain text-to-SQL benchmark, to diagnose the model robustness. We design 17 perturbations on databases, natural language questions, and SQL queries to measure the robustness from different angles. In order to collect more diversified natural question perturbations, we utilize large pretrained language models (PLMs) to simulate human behaviors in creating natural questions. We conduct a diagnostic study of the state-of-the-art models on the robustness set. Experimental results reveal that even the most robust model suffers from a 14.0% performance drop overall and a 50.7% performance drop on the most challenging perturbation. We also present a breakdown analysis regarding text-to-SQL model designs and provide insights for improving model robustness.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2301.08881

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.46)
Consumer Products & Services (0.46)
Transportation > Air (0.46)
Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)

Add feedback

MapQA: A Dataset for Question Answering on Choropleth Maps

Chang, Shuaichen, Palzer, David, Li, Jialin, Fosler-Lussier, Eric, Xiao, Ningchuan

arXiv.org Artificial IntelligenceNov-15-2022

Choropleth maps are a common visual representation for region-specific tabular data and are used in a number of different venues (newspapers, articles, etc). These maps are human-readable but are often challenging to deal with when trying to extract data for screen readers, analyses, or other related tasks. Recent research into Visual-Question Answering (VQA) has studied question answering on human-generated charts (ChartQA), such as bar, line, and pie charts. However, little work has paid attention to understanding maps; general VQA models, and ChartQA models, suffer when asked to perform this task. To facilitate and encourage research in this area, we present MapQA, a large-scale dataset of ~800K question-answer pairs over ~60K map images. Our task tests various levels of map understanding, from surface questions about map styles to complex questions that require reasoning on the underlying data. We present the unique challenges of MapQA that frustrate most strong baseline algorithms designed for ChartQA and general VQA tasks. We also present a novel algorithm, Visual Multi-Output Data Extraction based QA (V-MODEQA) for MapQA. V-MODEQA extracts the underlying structured data from a map image with a multi-output model and then performs reasoning on the extracted data. Our experimental results show that V-MODEQA has better overall performance and robustness on MapQA than the state-of-the-art ChartQA and VQA algorithms by capturing the unique properties in map question answering.

machine learning, natural language, question answering, (15 more...)

arXiv.org Artificial Intelligence

2211.08545

Country: North America > United States > Montana (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Prefix-to-SQL: Text-to-SQL Generation from Incomplete User Questions

Deng, Naihao, Chang, Shuaichen, Shi, Peng, Yu, Tao, Zhang, Rui

arXiv.org Artificial IntelligenceSep-15-2021

Existing text-to-SQL research only considers complete questions as the input, but lay-users might strive to formulate a complete question. To build a smarter natural language interface to database systems (NLIDB) that also processes incomplete questions, we propose a new task, prefix-to-SQL which takes question prefix from users as the input and predicts the intended SQL. We construct a new benchmark called PAGSAS that contains 124K user question prefixes and the intended SQL for 5 sub-tasks Advising, GeoQuery, Scholar, ATIS, and Spider. Additionally, we propose a new metric SAVE to measure how much effort can be saved by users. Experimental results show that PAGSAS is challenging even for strong baseline models such as T5. As we observe the difficulty of prefix-to-SQL is related to the number of omitted tokens, we incorporate curriculum learning of feeding examples with an increasing number of omitted tokens. This improves scores on various sub-tasks by as much as 9% recall scores on sub-task GeoQuery in PAGSAS.

deep learning, neural network, sql query, (21 more...)

arXiv.org Artificial Intelligence

2109.13066

Country:

Europe (1.00)
North America > United States > New York (0.14)
North America > United States > Michigan (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback