Generating Regular Expressions from Natural Language Specifications: Are We There Yet?

Zhong, Zexuan (University of Illinois at Urbana-Champaign) | Guo, Jiaqi (Xi’an Jiaotong University) | Yang, Wei (University of Illinois at Urbana-Champaign) | Xie, Tao (University of Illinois at Urbana-Champaign) | Lou, Jian-Guang (Microsoft Research Asia) | Liu, Ting (Xi’an Jiaotong University) | Zhang, Dongmei (Microsoft Research Asia)

Apr-6-2018–AAAI Conferences

Recent state-of-the-art approaches automatically generate regular expressions from natural language specifications. Given that these approaches use only synthetic data in both training datasets and validation/test datasets, a natural question arises: are these approaches effective in addressing various real-world situations? To explore this question, we conduct a characteristic study comparing two synthetic datasets used in recent research with a real-world dataset collected from the Internet, and an experimental study applying a state-of-the-art approach to the real-world dataset. Our results suggest that the synthetic datasets and the real-world dataset have distinct characteristics, and that the state-of-the-art approach (based on a model trained on a synthetic dataset) achieves extremely low effectiveness when evaluated on real-world data, much lower than its effectiveness on the synthetic dataset. We also provide an initial analysis of some of the challenging cases and discuss future directions.

artificial intelligence, generating regular expression, natural language specification


Genre:
- Research Report > New Finding (0.53)

Technology:
- Information Technology
  - Software (0.60)
  - Artificial Intelligence > Natural Language (0.60)
