A Multi-Agent Pokemon Tournament for Evaluating Strategic Reasoning of Large Language Models
