Codenames as a Benchmark for Large Language Models