DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations
Lu, Bo-Ru, Haduong, Nikita, Lee, Chia-Hsuan, Wu, Zeqiu, Cheng, Hao, Koester, Paul, Utke, Jean, Yu, Tao, Smith, Noah A., Ostendorf, Mari
–arXiv.org Artificial Intelligence
Applications that could benefit from automatic understanding of human-human conversations often face challenges associated with private information in real-world data, such as call center or clinical conversations. Working with protected data also increases the cost of annotation, which limits technology development. To address these challenges, we propose DIALGEN, a human-in-the-loop, semi-automated dialogue generation framework. DIALGEN uses a language model (ChatGPT) that can follow schema and style specifications to produce fluent conversational text. It builds a complex conversation by iteratively generating subdialogues and using human feedback to correct inconsistencies or redirect the flow. In experiments on structured summarization of agent-client information-gathering calls, framed as dialogue state tracking, we show that DIALGEN data enables significant improvement in model performance.
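The iterative loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `build_dialogue`, `generate`, `review`, and `is_complete` are hypothetical names, the language model call is replaced by a toy function, and the human reviewer is simulated by a simple string edit.

```python
from typing import Callable, List

Turn = str


def build_dialogue(
    generate: Callable[[List[Turn]], List[Turn]],
    review: Callable[[List[Turn], List[Turn]], List[Turn]],
    is_complete: Callable[[List[Turn]], bool],
    max_rounds: int = 10,
) -> List[Turn]:
    """Grow a dialogue one subdialogue at a time with a review step.

    All three callables are assumptions made for this sketch:
    generate    -- stands in for the language model; drafts the next turns
    review      -- stands in for the human; returns the corrected turns
    is_complete -- decides when the conversation has covered enough
    """
    history: List[Turn] = []
    for _ in range(max_rounds):
        draft = generate(history)          # model proposes a subdialogue
        accepted = review(history, draft)  # human fixes or redirects it
        history.extend(accepted)           # only reviewed turns are kept
        if is_complete(history):
            break
    return history


# Toy stand-ins so the sketch runs without a model or an annotator.
def toy_generate(history: List[Turn]) -> List[Turn]:
    n = len(history)
    return [f"Agent: question {n}", f"Client: answer {n}"]


def toy_review(history: List[Turn], draft: List[Turn]) -> List[Turn]:
    # Simulated correction: the "human" rewrites every client answer.
    return [turn.replace("answer", "corrected answer") for turn in draft]


def toy_is_complete(history: List[Turn]) -> bool:
    return len(history) >= 6


dialogue = build_dialogue(toy_generate, toy_review, toy_is_complete)
for turn in dialogue:
    print(turn)
```

Because each new draft is conditioned only on turns that have already passed review, corrections made early in the conversation carry forward into later subdialogues.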
Jul-13-2023
- Country:
  - Europe (1.00)
  - North America > United States
    - California
      - Los Angeles County > Santa Monica (0.14)
      - San Francisco County > San Francisco (0.14)
- Genre:
  - Research Report (1.00)
- Industry:
  - Automobiles & Trucks > Manufacturer (1.00)
  - Banking & Finance > Insurance (0.93)
  - Health & Medicine (1.00)
  - Information Technology > Security & Privacy (0.67)
  - Transportation
    - Ground > Road (1.00)
    - Infrastructure & Services (0.68)
    - Passenger (1.00)