UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models
