AITopics | Lan, Wuwei

Lan, Wuwei

PRACTIQ: A Practical Conversational Text-to-SQL dataset with Ambiguous and Unanswerable Queries

Dong, Mingwen, Kumar, Nischal Ashok, Hu, Yiqun, Chauhan, Anuj, Hang, Chung-Wei, Chang, Shuaichen, Pan, Lin, Lan, Wuwei, Zhu, Henghui, Jiang, Jiarong, Ng, Patrick, Wang, Zhiguo

arXiv.org Artificial Intelligence, Oct-14-2024

Previous text-to-SQL datasets and systems have primarily focused on user questions with clear intentions that can be answered. However, real user questions are often ambiguous, with multiple interpretations, or unanswerable due to a lack of relevant data. In this work, we construct a practical conversational text-to-SQL dataset called PRACTIQ, consisting of ambiguous and unanswerable questions inspired by real-world user questions. We first identified four categories of ambiguous questions and four categories of unanswerable questions by studying existing text-to-SQL datasets. Then, we generate conversations with four turns: the initial user question, an assistant response seeking clarification, the user's clarification, and the assistant's clarified SQL response with a natural language explanation of the execution results. For some ambiguous queries, we also directly generate helpful SQL responses that consider multiple aspects of ambiguity, instead of requesting user clarification. To benchmark performance on ambiguous, unanswerable, and answerable questions, we implemented large language model (LLM)-based baselines using various LLMs. Our approach involves two steps: question category classification and clarification SQL prediction. Our experiments reveal that state-of-the-art systems struggle to handle ambiguous and unanswerable questions effectively. We will release our code for data generation and experiments on GitHub.
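The four-turn conversation layout described in the abstract can be pictured with a small data structure. The following is only an illustrative sketch: the names `Turn` and `build_conversation`, the `kind` labels, and the example question, SQL, and explanation are all hypothetical and are not taken from the PRACTIQ release.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    kind: str  # hypothetical label for the turn's purpose
    text: str


def build_conversation(question: str, clarification_request: str,
                       clarification: str, sql: str,
                       explanation: str) -> List[Turn]:
    """Assemble the four turns named in the abstract, in order."""
    return [
        Turn("user", "initial_question", question),
        Turn("assistant", "clarification_request", clarification_request),
        Turn("user", "clarification", clarification),
        # The final turn pairs the SQL with a natural language explanation.
        Turn("assistant", "sql_response", f"{sql}\n{explanation}"),
    ]


# Made-up example of an ambiguous question being resolved.
conversation = build_conversation(
    question="Show me the name of each singer.",
    clarification_request="Do you mean the stage name or the legal name?",
    clarification="The stage name.",
    sql="SELECT stage_name FROM singer;",
    explanation="This lists the stage name of every singer in the table.",
)
for turn in conversation:
    print(f"[{turn.role}/{turn.kind}] {turn.text}")
```

Keeping each turn's purpose explicit in a `kind` field is one possible design choice; it makes it easy to separate the clarification step from the final SQL prediction when evaluating the two steps the abstract mentions.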

category, large language model, natural language, (18 more...)

2410.11076

Country:

Europe (0.67)
Asia (0.46)
North America > United States (0.28)

Genre:

Research Report (0.82)
Personal (0.67)

Industry:

Automobiles & Trucks > Manufacturer (1.00)
Leisure & Entertainment (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

UNITE: A Unified Benchmark for Text-to-SQL Evaluation

Lan, Wuwei, Wang, Zhiguo, Chauhan, Anuj, Zhu, Henghui, Li, Alexander, Guo, Jiang, Zhang, Sheng, Hang, Chung-Wei, Lilien, Joseph, Hu, Yiqun, Pan, Lin, Dong, Mingwen, Wang, Jun, Jiang, Jiarong, Ash, Stephen, Castelli, Vittorio, Ng, Patrick, Xiang, Bing

arXiv.org Artificial Intelligence, Jul-14-2023

A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures. To comprehensively evaluate text-to-SQL systems, we introduce a UNIfied benchmark for Text-to-SQL Evaluation (UNITE). It is composed of publicly available text-to-SQL datasets, containing natural language questions from more than 12 domains, SQL queries from more than 3.9K patterns, and 29K databases. Compared to the widely used Spider benchmark, we introduce approximately 120K additional examples and a threefold increase in SQL patterns, such as comparative and boolean questions. We conduct a systematic study of six state-of-the-art (SOTA) text-to-SQL parsers on our new benchmark and show that: 1) Codex performs surprisingly well on out-of-domain datasets; 2) specially designed decoding methods (e.g., constrained beam search) can improve performance for both in-domain and out-of-domain settings; 3) explicitly modeling the relationship between questions and schemas further improves the Seq2Seq models. More importantly, our benchmark presents key challenges in compositional generalization and robustness, which these SOTA models cannot address well. Our code and data processing script are available at https://github.com/awslabs/unified-text2sql-benchmark

computational linguistic, machine learning, natural language, (19 more...)

2305.16265

Country:

Europe (0.93)
North America > United States > Louisiana (0.14)

Genre: Research Report (0.64)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

Chang, Shuaichen, Wang, Jun, Dong, Mingwen, Pan, Lin, Zhu, Henghui, Li, Alexander Hanbo, Lan, Wuwei, Zhang, Sheng, Jiang, Jiarong, Lilien, Joseph, Ash, Steve, Wang, William Yang, Wang, Zhiguo, Castelli, Vittorio, Ng, Patrick, Xiang, Bing

arXiv.org Artificial Intelligence, Jan-28-2023

Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries. However, recent studies reveal that text-to-SQL models are vulnerable to task-specific perturbations. Previous curated robustness test sets usually focus on individual phenomena. In this paper, we propose a comprehensive robustness benchmark based on Spider, a cross-domain text-to-SQL benchmark, to diagnose model robustness. We design 17 perturbations on databases, natural language questions, and SQL queries to measure robustness from different angles. To collect more diverse natural question perturbations, we utilize large pretrained language models (PLMs) to simulate human behaviors in creating natural questions. We conduct a diagnostic study of the state-of-the-art models on the robustness set. Experimental results reveal that even the most robust model suffers from a 14.0% performance drop overall and a 50.7% performance drop on the most challenging perturbation. We also present a breakdown analysis regarding text-to-SQL model designs and provide insights for improving model robustness.

artificial intelligence, machine learning, natural language, (17 more...)

2301.08881

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.46)
Consumer Products & Services (0.46)
Transportation > Air (0.46)
Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)

Importance of Synthesizing High-quality Data for Text-to-SQL Parsing

Zhao, Yiyun, Jiang, Jiarong, Hu, Yiqun, Lan, Wuwei, Zhu, Henry, Chauhan, Anuj, Li, Alexander, Pan, Lin, Wang, Jun, Hang, Chung-Wei, Zhang, Sheng, Dong, Marvin, Lilien, Joe, Ng, Patrick, Wang, Zhiguo, Castelli, Vittorio, Xiang, Bing

arXiv.org Artificial Intelligence, Dec-16-2022

Recently, there has been increasing interest in synthesizing data to improve downstream text-to-SQL tasks. In this paper, we first examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data. We observed two shortcomings: illogical synthetic SQL queries from independent column sampling and arbitrary table joins. To address these issues, we propose a novel synthesis framework that incorporates key relationships from schema, imposes strong typing, and conducts schema-distance-weighted column sampling. We also adopt an intermediate representation (IR) for the SQL-to-text task to further improve the quality of the generated natural language questions. When existing powerful semantic parsers are pre-finetuned on our high-quality synthesized data, our experiments show that these models have significant accuracy boosts on popular benchmarks, including new state-of-the-art performance on Spider.
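One way to picture schema-distance-weighted column sampling is sketched below. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: distance is taken as the number of foreign-key hops between tables, weights decay exponentially with that distance, and the schema, the `decay` parameter, and all function names are hypothetical rather than the authors' implementation.

```python
import random
from collections import deque

# Hypothetical schema: tables linked by (undirected) foreign-key relations.
FK_GRAPH = {
    "singer": ["concert"],
    "concert": ["singer", "stadium"],
    "stadium": ["concert"],
}
COLUMNS = [("singer", "name"), ("concert", "year"), ("stadium", "capacity")]


def table_distances(fk_graph, start):
    """Breadth-first search: foreign-key hop count from `start` to each table."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        for neighbor in fk_graph.get(table, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[table] + 1
                queue.append(neighbor)
    return dist


def column_weights(columns, fk_graph, anchor_table, decay=0.5):
    """Assumed weighting: decay ** hops, so nearer tables are favored."""
    dist = table_distances(fk_graph, anchor_table)
    # Columns in tables unreachable from the anchor are excluded entirely.
    return {(t, c): decay ** dist[t] for t, c in columns if t in dist}


def sample_column(columns, fk_graph, anchor_table, rng, decay=0.5):
    """Draw one (table, column) pair in proportion to its weight."""
    weights = column_weights(columns, fk_graph, anchor_table, decay)
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]


rng = random.Random(0)
print(column_weights(COLUMNS, FK_GRAPH, "singer"))
print(sample_column(COLUMNS, FK_GRAPH, "singer", rng))
```

Under this assumed scheme, columns from tables that are not connected to the anchor table are never sampled together with it, which is one plausible way to avoid the arbitrary table joins the abstract identifies as a shortcoming.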

artificial intelligence, computational linguistic, natural language, (14 more...)

2212.08785

Country:

Europe (1.00)
North America > United States (0.93)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Travel Time Estimation without Road Networks: An Urban Morphological Layout Representation Approach

Lan, Wuwei, Xu, Yanyan, Zhao, Bin

arXiv.org Artificial Intelligence, Jul-7-2019

Travel time estimation is a crucial task not only for personal travel scheduling but also for city planning. Earlier methods model individual road segments or sub-paths and sum their estimates for a final prediction; these have recently been replaced by deep neural models with end-to-end training. Usually, these methods are based on explicit feature representations, including spatio-temporal features, traffic states, etc. Here, we argue that the local traffic condition is closely tied to land use and the built environment (e.g., metro stations, arterial roads, intersections, commercial areas, and residential areas), yet the relation is time-varying and too complicated to model explicitly and efficiently. Thus, this paper proposes an end-to-end multi-task deep neural model, named Deep Image to Time (DeepI2T), that learns the travel time mainly from built environment images, a.k.a. morphological layout images, and achieves new state-of-the-art performance on real-world datasets from two cities. Moreover, our model is designed to handle both path-aware and path-blind scenarios in the testing phase. This work opens up new opportunities for using publicly available morphological layout images as a valuable source of information in multiple geography-related smart city applications.

deep learning, ground transportation, neural network, (20 more...)

1907.03381

Country:

Asia > China (0.48)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)