Diagnosing Failures in Large Language Models' Answers: Integrating Error Attribution into Evaluation Framework
