From Text to CQL: Bridging Natural Language and Corpus Search Engine
Lu, Luming, An, Jiyuan, Wang, Yujie, Yang, Liner, Kong, Cunliang, Liu, Zhenghao, Wang, Shuo, Lin, Haozhe, Fang, Mingwei, Huang, Yaping, Yang, Erhong
arXiv.org Artificial Intelligence
Natural Language Processing (NLP) technologies have revolutionized the way we interact with information systems, with a significant focus on converting natural language queries into formal query languages such as SQL. However, less emphasis has been placed on the Corpus Query Language (CQL), a critical tool for linguistic research and detailed analysis within text corpora. Constructing CQL queries manually is a complex, time-intensive task that requires considerable expertise, which presents a notable challenge for both researchers and practitioners. This paper presents the first text-to-CQL task, which aims to automate the translation of natural language into CQL. We present a comprehensive framework for this task, including a specifically curated large-scale dataset and methodologies that leverage large language models (LLMs) for the text-to-CQL task. In addition, we establish advanced evaluation metrics to assess the syntactic and semantic accuracy of the generated queries. We develop innovative LLM-based conversion approaches and conduct detailed experiments. The results demonstrate the efficacy of our methods and provide insights into the complexities of the text-to-CQL task.
Feb-21-2024
- Country:
- Asia > China (0.14)
- Europe > Belgium (0.14)
- North America > United States (0.14)
- Genre:
- Research Report > New Finding (0.34)
- Technology: