Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models

Lin, Sheng-Chieh, Yang, Jheng-Hong, Nogueira, Rodrigo, Tsai, Ming-Feng, Wang, Chuan-Ju, Lin, Jimmy

Apr-4-2020–arXiv.org Artificial Intelligence

This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs). We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task. In CQR benchmarks of task-oriented dialogue systems, we evaluate fine-tuned PLMs on the recently introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-of-domain task. Examining a variety of architectures with different numbers of parameters, we demonstrate that the recent text-to-text transfer transformer (T5) achieves the best results on both CANARD and CAsT with fewer parameters than similar transformer architectures.
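To make the sequence-to-sequence framing concrete: in CQR, the model reads the dialogue history together with the current question and generates a self-contained rewrite of that question. The following is a minimal sketch of how the source string for such a model might be assembled. The separator string and the helper name `build_cqr_input` are illustrative assumptions, not details given in the abstract.

```python
from typing import List

# Hypothetical separator between turns; the abstract does not specify one.
SEP = " ||| "


def build_cqr_input(history: List[str], question: str, sep: str = SEP) -> str:
    """Flatten the dialogue history and the current question into one source string.

    Empty or whitespace-only history turns are dropped; the current question
    always comes last.
    """
    turns = [turn.strip() for turn in history if turn.strip()]
    return sep.join(turns + [question.strip()])


if __name__ == "__main__":
    source = build_cqr_input(
        ["Who wrote Hamlet?", "William Shakespeare."],
        "When was he born?",
    )
    # The resulting string would be fed to a fine-tuned seq2seq model, whose
    # target is a standalone question such as "When was William Shakespeare born?"
    print(source)
```

Under this framing, the same flattened input can be passed to any encoder-decoder model, so architectures with different parameter counts can be compared on identical data.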
