Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
