Multi-Document Grounded Multi-Turn Synthetic Dialog Generation

Lee, Young-Suk; Gunasekara, Chulaka; Contractor, Danish; Astudillo, Ramón Fernandez; Florian, Radu

Sep-17-2024–arXiv.org Artificial Intelligence

For multi-document grounded dialog generation, user queries and agent answers are based on top-k retrieved passages. In particular, we generate an initial user query from a single document source and generate the agent answer from top-k passages retrieved on the initial user query. Subsequent user queries and all agent answers are grounded on the retrieved passages and dialog history. We use a series of carefully designed prompts to ensure generated agent answers continue to remain meaningful in the presence of retrieved passages, often noisier than human generated documents.

As instruction-tuned language models have proven highly effective to generalize to new tasks (Chung et al., 2022; Wei et al., 2021; Ouyang et al., 2022; Mishra et al., 2022; Wang et al., 2022b), there has been growing interest to acquire synthetic data sets generated from pre-trained language models with a minimal or no human supervision (Honovich et al., 2022; Wang et al., 2023; Xu et al., 2023; Lee et al., 2023). While there has been an exploration of synthetic data generation for persona-grounded dialog generation (Jang et al., 2022; Bao et al., 2023), ...
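The generation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `retrieve_top_k` and `generate_dialog`, the `TASK:` prompt layout, the word-overlap retriever, and the choice to re-retrieve passages on every new user query are all hypothetical. The `llm` argument stands in for any text-in, text-out model call.

```python
from collections import Counter
from typing import Callable, List, Tuple


def retrieve_top_k(query: str, corpus: List[str], k: int = 3) -> List[str]:
    """Toy lexical retriever (an assumption): rank passages by word overlap."""
    query_words = Counter(query.lower().split())

    def overlap(passage: str) -> int:
        # Multiset intersection counts the words shared with the query.
        return sum((query_words & Counter(passage.lower().split())).values())

    # sorted() is stable, so ties keep their original corpus order.
    return sorted(corpus, key=overlap, reverse=True)[:k]


def generate_dialog(
    seed_document: str,
    corpus: List[str],
    llm: Callable[[str], str],
    num_turns: int = 2,
    k: int = 3,
) -> List[Tuple[str, str]]:
    """Return a list of (user_query, agent_answer) turns."""
    history: List[Tuple[str, str]] = []
    retrieved: List[str] = []
    for turn in range(num_turns):
        if turn == 0:
            # The initial user query comes from a single source document.
            query = llm(f"TASK: user_query\nDOCUMENT: {seed_document}")
        else:
            # Later queries are grounded on retrieved passages and history.
            query = llm(
                f"TASK: user_query\nPASSAGES: {retrieved}\nHISTORY: {history}"
            )
        # Assumption: passages are re-retrieved for each new user query.
        retrieved = retrieve_top_k(query, corpus, k)
        # Every agent answer is grounded on top-k passages and history.
        answer = llm(
            f"TASK: agent_answer\nQUERY: {query}\n"
            f"PASSAGES: {retrieved}\nHISTORY: {history}"
        )
        history.append((query, answer))
    return history
```

In practice the toy retriever would be replaced by a real passage retriever and `llm` by a prompted language model; the sketch only shows how queries, retrieval, and answers feed into one another across turns.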

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.68)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
