IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering

Neural Information Processing Systems 

Traditional methods for evaluating Large Language Models (LLMs) on question answering (QA) typically assess single-turn responses to given questions. However, this approach does not capture the dynamic nature of human-AI interactions, where humans actively seek information through conversation.
