Evaluating Open-Domain Dialogues in Latent Space with Next Sentence Prediction and Mutual Information
Zhao, Kun; Yang, Bohao; Lin, Chenghua; Rong, Wenge; Villavicencio, Aline; Cui, Xiaohui
arXiv.org Artificial Intelligence
The long-standing one-to-many problem in open-domain dialogue poses significant challenges for automatic evaluation methods: a given conversational context may admit multiple suitable responses that differ in semantics. To tackle this challenge, we propose a novel learning-based automatic evaluation metric (CMN), which can robustly evaluate open-domain dialogues by augmenting Conditional Variational Autoencoders (CVAEs) with a Next Sentence Prediction (NSP) objective and employing Mutual Information (MI) to model the semantic similarity of text in the latent space. Experimental results on two open-domain dialogue datasets demonstrate the superiority of our method compared with a wide range of baselines, especially in handling responses that are semantically distant from the golden reference responses.
Jun-10-2023
- Country:
- Asia > China
- Hubei Province > Wuhan (0.04)
- Europe
- Italy > Tuscany
- Florence (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- United Kingdom > England
- South Yorkshire > Sheffield (0.04)
- North America > United States
- Michigan (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Oceania > Australia
- South America > Colombia
- Meta Department > Villavicencio (0.04)
- Genre:
- Research Report (0.50)
- Technology: