UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models