LLMs as Agentic Cooperative Players in Multiplayer UNO
Martinez, Yago Romano, Roberts, Jesse
arXiv.org Artificial Intelligence
Third, the prompt included the current game state: the number of players, the last played card, the hand contents, the next player, recent moves, and the legal actions. Finally, the LLM was asked to choose the best action according to the specified prompting method. The game state information was extracted from RLCard and reformatted for readability: while RLCard encodes cards in shorthand (e.g., "r-5" for red 5), we expanded these into full descriptions to improve the model's comprehension. An example of the complete prompt format is shown in Figure 3.

To drive the model's action selection, we applied two prompting strategies inspired by Moore et al. [17]: cloze prompting and counterfactual prompting. These methods determine how the model interprets the prompt and evaluates its legal actions during gameplay.

Cloze Prompting: In this method, legal actions were labeled with sequential letters (A, B, C, etc.), and the LLM was instructed to choose the letter corresponding to the best move. The output was restricted to a single token, and the highest-probability token among the allowable action letters was selected as the action.
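The card expansion and cloze selection steps described above can be sketched as follows. This is a minimal illustration, not the authors' actual implementation: the function names, the color map, and the token-logprob dictionary are all assumptions introduced here for clarity.

```python
import string

# Assumed mapping from RLCard color shorthand to full color names.
COLORS = {"r": "red", "g": "green", "b": "blue", "y": "yellow"}


def expand_card(shorthand: str) -> str:
    """Expand RLCard shorthand (e.g., 'r-5') into a full description ('red 5')."""
    color, _, rank = shorthand.partition("-")
    return f"{COLORS.get(color, color)} {rank}"


def label_actions(legal_actions):
    """Pair each legal action with a sequential letter: A, B, C, ..."""
    return {letter: action
            for letter, action in zip(string.ascii_uppercase, legal_actions)}


def choose_cloze_action(labeled, token_logprobs):
    """Cloze selection: restrict the model's single-token output to the
    allowed action letters, then take the highest-probability one."""
    allowed = {tok: lp for tok, lp in token_logprobs.items() if tok in labeled}
    best_letter = max(allowed, key=allowed.get)
    return labeled[best_letter]
```

For example, with legal actions `["play r-5", "play g-5", "draw"]` labeled A-C, a hypothetical logprob dictionary `{"A": -0.4, "B": -1.2, "C": -2.0, "D": -3.0}` would have the stray token "D" filtered out and return `"play r-5"`.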
Sep-15-2025