LLMs as Agentic Cooperative Players in Multiplayer UNO

Martinez, Yago Romano, Roberts, Jesse

arXiv.org Artificial Intelligence 

Third, the current game state data: the number of players, the last played card, the hand contents, the next player, recent moves, and the legal actions. Finally, the LLM was asked to choose the best action according to the specified prompting method.

The game state information was extracted from RLCard and reformatted for readability. While RLCard encodes cards using shorthand (e.g., "r-5" for red 5), we expanded these into full descriptions to improve the model's comprehension. An example of the complete prompt format is shown in Figure 3.

To drive the model's action selection, we applied two prompting strategies inspired by Moore et al. [17]: cloze prompting and counterfactual prompting. These methods determine how the model interprets the prompt and evaluates its legal actions during gameplay.

Cloze Prompting: In this method, legal actions were labeled with sequential letters (A, B, C, etc.), and the LLM was instructed to choose the letter corresponding to the best move. The output was restricted to a single token, and the highest-probability token among the allowable action labels was selected as the action.
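The cloze selection step described above can be sketched as follows. This is an illustrative reconstruction, not the paper's code: `token_logprobs` stands in for whatever API returns next-token log-probabilities for the prompt, and the letter-labeling scheme follows the description in the text.

```python
import math

def label_actions(legal_actions):
    """Label legal actions with sequential letters: {'A': action0, 'B': action1, ...}."""
    return {chr(ord('A') + i): action for i, action in enumerate(legal_actions)}

def choose_action_cloze(token_logprobs, legal_actions):
    """Pick the legal action whose label letter received the highest log-probability.

    token_logprobs: dict mapping candidate output tokens to log-probabilities
    (assumed to come from a single-token completion of the prompt).
    Tokens that do not correspond to a legal action label are ignored.
    """
    labels = label_actions(legal_actions)
    best_letter = max(labels, key=lambda letter: token_logprobs.get(letter, -math.inf))
    return labels[best_letter]
```

Restricting the argmax to the label letters implements the "highest-probability token from the set of allowable actions" rule: even if the model assigns more mass to an off-vocabulary token, only legal-action labels can be selected.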
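The card-name expansion mentioned earlier (shorthand such as "r-5" becoming "red 5") can be sketched as a small lookup. The exact set of RLCard trait codes used here is an assumption based on the paper's example, not a documented card list.

```python
# Assumed RLCard-style UNO shorthand: "<color>-<trait>", e.g. "r-5", "g-skip".
COLORS = {'r': 'red', 'g': 'green', 'b': 'blue', 'y': 'yellow'}
TRAITS = {'skip': 'skip', 'reverse': 'reverse', 'draw_2': 'draw two',
          'wild': 'wild', 'wild_draw_4': 'wild draw four'}

def expand_card(code):
    """Expand a shorthand card code into a full description for the prompt."""
    color, trait = code.split('-', 1)
    color_name = COLORS.get(color, color)
    trait_name = TRAITS.get(trait, trait)  # digit ranks pass through unchanged
    return f"{color_name} {trait_name}"
```

Expanding every card this way before building the prompt keeps the game state in plain English, which is the readability reformatting the section describes.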