Searching for Snippets of Open-Domain Dialogue in Task-Oriented Dialogue Datasets
Stricker, Armand, Paroubek, Patrick
arXiv.org Artificial Intelligence
Most existing dialogue corpora and models have been designed to fit one of two predominant categories: task-oriented dialogues pursue functional goals, such as making a restaurant reservation or booking a plane ticket, while chit-chat (open-domain) dialogues focus on holding a socially engaging conversation with a user. However, humans tend to switch seamlessly between the two modes and even use chit-chat to enhance task-oriented conversations. To bridge this gap, new datasets have recently been created that blend both communication modes within individual conversations. These approaches tend to rely on adding chit-chat snippets to pre-existing, human-generated task-oriented datasets. Given the tendencies observed in humans, however, we ask whether these task-oriented datasets do not already contain chit-chat sequences. Using topic modeling and searching for the topics most similar to a set of keywords related to social talk, we explore the training sets of Schema-Guided Dialogues and MultiWOZ. Our study shows that sequences related to social talk are indeed naturally present, motivating further research on how chit-chat is combined with task-oriented dialogue.
Nov-23-2023
- Country:
  - Europe
    - Belgium (0.04)
    - France (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
  - North America > United States
    - New York > New York County
      - New York City (0.04)
    - Texas (0.04)
- Genre:
  - Research Report (0.82)
- Industry:
  - Consumer Products & Services > Restaurants (0.34)
  - Health & Medicine (0.68)
  - Leisure & Entertainment (0.68)
  - Media (0.46)
- Technology: