Multi-Document Grounded Multi-Turn Synthetic Dialog Generation
Young-Suk Lee, Chulaka Gunasekara, Danish Contractor, Ramón Fernandez Astudillo, Radu Florian
arXiv.org Artificial Intelligence
As instruction-tuned language models have proven highly effective at generalizing to new tasks (Chung et al., 2022; Wei et al., 2021; Ouyang et al., 2022; Mishra et al., 2022; Wang et al., 2022b), there has been growing interest in acquiring synthetic data sets generated from pre-trained language models with minimal or no human supervision (Honovich et al., 2022; Wang et al., 2023; Xu et al., 2023; Lee et al., 2023). While there has been an exploration of synthetic data generation for persona-grounded dialog generation (Jang et al., 2022; Bao et al., 2023), ...

For multi-document grounded dialog generation, user queries and agent answers are based on top-k retrieved passages. In particular, we generate an initial user query from a single document source and generate the agent answer from the top-k passages retrieved on the initial user query. Subsequent user queries and all agent answers are grounded on the retrieved passages and the dialog history. We use a series of carefully designed prompts to ensure that generated agent answers remain meaningful in the presence of retrieved passages, which are often noisier than human-generated documents.
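The grounding loop described above (an initial query from one document, top-k retrieval, then answers and follow-up queries conditioned on the retrieved passages and the dialog history) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `retrieve_top_k` is a toy token-overlap retriever standing in for whatever retriever the authors use, `llm` is any caller-supplied text-generation function, and the prompt strings are placeholders rather than the authors' prompts. The sketch also assumes that each follow-up query is grounded on the passages retrieved in the previous turn, and that passages are retrieved again for every new query.

```python
import re
from collections import Counter
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_top_k(query: str, passages: List[str], k: int) -> List[str]:
    """Toy retriever (assumption): rank passages by token overlap with the query."""
    q = Counter(tokenize(query))
    scored = [
        (sum((q & Counter(tokenize(p))).values()), i)
        for i, p in enumerate(passages)
    ]
    # Highest overlap first; ties broken by original passage order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [passages[i] for _, i in scored[:k]]


def format_history(history: List[Tuple[str, str]]) -> str:
    """Render the dialog so far as 'role: text' lines."""
    return "\n".join(f"{role}: {text}" for role, text in history)


def generate_dialog(
    seed_document: str,
    corpus: List[str],
    llm: Callable[[str], str],  # hypothetical: maps a prompt to generated text
    num_turns: int = 2,
    k: int = 2,
) -> List[Tuple[str, str]]:
    """Generate a dialog of num_turns (user, agent) exchanges."""
    history: List[Tuple[str, str]] = []
    # The initial user query comes from a single source document.
    query = llm(f"Document:\n{seed_document}\nWrite a user question about it.")
    for turn in range(num_turns):
        # Agent answers are grounded on top-k retrieved passages and the history.
        context = "\n".join(retrieve_top_k(query, corpus, k))
        answer = llm(
            f"Passages:\n{context}\nDialog:\n{format_history(history)}\n"
            f"user: {query}\nAnswer as the agent using only the passages."
        )
        history.append(("user", query))
        history.append(("agent", answer))
        if turn + 1 < num_turns:
            # Follow-up queries are grounded on the passages and the history too.
            query = llm(
                f"Passages:\n{context}\nDialog:\n{format_history(history)}\n"
                "Write the next user question."
            )
    return history
```

Passing a real model client as `llm` and a stronger retriever in place of `retrieve_top_k` would leave the control flow unchanged; only the two stand-ins need replacing.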
Sep-17-2024