Resurrecting saturated LLM benchmarks with adversarial encoding