MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures
Jinjie Ni
