ChatQA: Building GPT-4 Level Conversational QA Models
Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Chankyu Lee, Mohammad Shoeybi, Bryan Catanzaro
In this work, we introduce ChatQA, a family of conversational question answering (QA) models that achieve GPT-4-level accuracy. Specifically, we propose a two-stage instruction tuning method that significantly improves zero-shot conversational QA results from large language models (LLMs). To handle retrieval-augmented generation in conversational QA, we fine-tune a dense retriever on a multi-turn QA dataset, which yields results comparable to using a state-of-the-art query rewriting model while substantially reducing deployment cost. Notably, our ChatQA-70B outperforms GPT-4 in terms of average score on 10 conversational QA datasets (54.14 vs. 53.90), without relying on any synthetic data from OpenAI GPT models.
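The sketch below illustrates the retrieval idea described in the abstract: instead of rewriting a follow-up question into a standalone query, the full dialogue history is concatenated with the current question and embedded directly by a dense retriever. The encoder, model name, passages, and dialogue are illustrative stand-ins, not the retriever or data used in the paper.

```python
# Minimal sketch of multi-turn dense retrieval, assuming a generic sentence
# encoder in place of the paper's fine-tuned retriever.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder choice

passages = [
    "ChatQA proposes a two-stage instruction tuning recipe for conversational QA.",
    "Dense retrievers map queries and passages into a shared embedding space.",
    "Query rewriting converts a follow-up question into a standalone query.",
]
passage_emb = encoder.encode(passages, normalize_embeddings=True)

# The follow-up question alone is ambiguous, so the dialogue history is
# prepended to form the retrieval query, avoiding a separate rewriting model.
dialogue = [
    "User: What does ChatQA do for retrieval-augmented conversational QA?",
    "Assistant: It fine-tunes a dense retriever on multi-turn QA data.",
    "User: How does that compare to query rewriting?",
]
query_emb = encoder.encode([" ".join(dialogue)], normalize_embeddings=True)

# Dot product over normalized embeddings equals cosine similarity.
scores = passage_emb @ query_emb[0]
best = int(np.argmax(scores))
print(f"Top passage (score {scores[best]:.3f}): {passages[best]}")
```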
arXiv.org Artificial Intelligence
Jan-23-2024