A Third Paradigm for LLM Evaluation: Dialogue Game-Based Evaluation using clembench
