Rethinking LLM Evaluation: Can We Evaluate LLMs with 200x Less Data?
