Evaluating Language Models' Evaluations of Games
