Pragmatically Appropriate Diversity for Dialogue Evaluation