Pragmatically Appropriate Diversity for Dialogue Evaluation
Stasaski, Katherine, Hearst, Marti A.
arXiv.org Artificial Intelligence
Linguistic pragmatics states that a conversation's underlying speech acts can constrain the type of response that is appropriate at each turn in the conversation. When generating dialogue responses, neural dialogue agents struggle to produce diverse responses. Currently, dialogue diversity is assessed using automatic metrics, but the underlying speech acts do not inform these metrics. To remedy this, we propose the notion of Pragmatically Appropriate Diversity, defined as the extent to which a conversation creates and constrains the creation of multiple diverse responses. Using a human-created multi-response dataset, we find significant support for the hypothesis that speech acts provide a signal for the diversity of the set of next responses. Building on this result, we propose a new human evaluation task in which creative writers predict the extent to which conversations inspire the creation of multiple diverse responses. Our studies find that writers' judgments align with the Pragmatically Appropriate Diversity of conversations. Our work suggests that expectations for diversity metric scores should vary depending on the speech act.
Apr-5-2023
- Country:
- Asia
- Europe
- Italy > Tuscany
- Florence (0.04)
- Slovenia (0.04)
- Spain > Valencian Community
- Valencia Province > Valencia (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- North America
- Canada (0.04)
- Central America (0.04)
- Costa Rica (0.04)
- Dominican Republic (0.04)
- United States
- California > San Diego County
- San Diego (0.04)
- Colorado > Boulder County
- Boulder (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York > New York County
- New York City (0.14)
- Washington > King County
- Seattle (0.04)
- South America (0.04)
- Genre:
- Personal > Interview (1.00)
- Research Report > Experimental Study (0.69)
- Industry:
- Information Technology (0.46)
- Technology: