Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models
Lin, Sheng-Chieh, Yang, Jheng-Hong, Nogueira, Rodrigo, Tsai, Ming-Feng, Wang, Chuan-Ju, Lin, Jimmy
–arXiv.org Artificial Intelligence
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs). We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task. In CQR benchmarks of task-oriented dialogue systems, we evaluate fine-tuned PLMs on the recently-introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-domain task. Examining a variety of architectures with different numbers of parameters, we demonstrate that the recent text-to-text transfer transformer (T5) achieves the best results both on CANARD and CAsT with fewer parameters, compared to similar transformer architectures.
arXiv.org Artificial Intelligence
Apr-4-2020