EscapeBench: Pushing Language Models to Think Outside the Box
