Resurrecting saturated LLM benchmarks with adversarial encoding
