

Exploring the Stratified Space Structure of an RL Game with the Volume Growth Transform

Curry, Justin, Lagasse, Brennan, Lam, Ngoc B., Cox, Gregory, Rosenbluth, David, Speranzon, Alberto

arXiv.org Artificial Intelligence

In this work, we explore the structure of the embedding space of a transformer model trained to play a particular reinforcement learning (RL) game. Specifically, we investigate how a transformer-based Proximal Policy Optimization (PPO) model embeds visual inputs in a simple environment where an agent must collect "coins" while avoiding dynamic obstacles consisting of "spotlights." By adapting Robinson et al.'s [15] study of the volume growth transform for LLMs to the RL setting, we find that the token embedding space for our visual coin-collecting game is also not a manifold, and is better modeled as a stratified space, where local dimension can vary from point to point. We further strengthen Robinson's method by proving that fairly general volume growth curves can be realized by stratified spaces. Finally, we carry out an analysis suggesting that as an RL agent acts, its latent representation alternates between periods of low local dimension, while following a fixed sub-strategy, and bursts of high local dimension, where the agent achieves a sub-goal (e.g., collecting an object) or where the environmental complexity increases (e.g., more obstacles appear). Consequently, our work suggests that the distribution of dimensions in a stratified latent space may provide a new geometric indicator of complexity for RL games.
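The volume-growth idea behind this analysis can be illustrated with a small sketch (not the authors' code): if the number of embedded points within distance r of a center scales as N(r) ∝ r^d, then the local dimension d is the slope of log N(r) against log r, and it can differ from point to point in a stratified space.

```python
import numpy as np

def local_dimension(points, center, radii):
    """Estimate local dimension at `center` from volume growth:
    if N(r) ~ r**d, then d is the slope of log N(r) vs. log r."""
    dists = np.linalg.norm(points - center, axis=1)
    counts = np.array([np.count_nonzero(dists <= r) for r in radii])
    mask = counts > 0
    # Least-squares fit of log-counts against log-radii; slope estimates d.
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(counts[mask]), 1)
    return slope

# Points sampled uniformly from a 2-D disk embedded in 3-D:
# the estimated local dimension near the origin should be close to 2.
rng = np.random.default_rng(0)
n = 4000
theta = rng.uniform(0.0, 2.0 * np.pi, n)
rad = np.sqrt(rng.uniform(0.0, 1.0, n))  # sqrt gives a uniform disk density
disk = np.column_stack([rad * np.cos(theta), rad * np.sin(theta), np.zeros(n)])
d_hat = local_dimension(disk, np.zeros(3), np.linspace(0.1, 0.5, 9))
```

In a stratified space, running this estimator at different base points would yield different slopes, which is the signal the paper tracks over the agent's trajectory.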


Looking for something new to spice up your game play? The Tinder of games is here

The Guardian

As any adult who loves video games knows, there are simply too many of them – 19,000 games were released in 2024 on PC games storefront Steam alone, not counting all the playable delights on consoles and smartphones. Most of us have backlogs of unplayed classics that make us feel guilty about buying newer games. Finding things that are actually good, meanwhile, can feel totally impossible. At least 50% of the questions people send in for this newsletter are a variant of "Help, what should I play?" We do our best to help, but even though it's my job to know about games, I still don't have infinite time to play them.


How a School Shooting Became a Video Game

The New Yorker

The Final Exam, a recently released video game in which you play as a student caught amid a school shooting, lasts for around ten minutes, about the length of a real shooting event in a U.S. school. The game opens in an empty locker room. You hear distant gunfire, screams, harried footsteps, and the thudding of heavy furniture being overturned. The sense of disharmony is immediate: a familiar scene of youth and learning is grimly debased into one of peril. As the lockers surround you, their doors gaping, you feel caged: get me out of here. Moments later, as you enter the gymnasium, a two-minute countdown flashes on screen.


Do you want to play a game? Learning to play Tic-Tac-Toe in Hypermedia Environments

Beaumont, Katharine, Collier, Rem

arXiv.org Artificial Intelligence

We demonstrate the integration of Transfer Learning into a hypermedia Multi-Agent System using the Multi-Agent MicroServices (MAMS) architectural style. Agents use RDF knowledge stores to reason over information and apply Reinforcement Learning techniques to learn how to interact with a Tic-Tac-Toe API. Agents form advisor-advisee relationships in order to speed up individual learning and exploit and learn from data on the Web.
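The combination of tabular RL with advisor-advisee transfer described above can be sketched as follows (state names and the toy transition are illustrative, not the paper's API): the advisee warm-starts from a copy of the advisor's Q-table instead of learning from scratch, which is one simple form of transfer.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, next_actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Advisor learns from its own experience (toy Tic-Tac-Toe-like transition).
advisor_Q = defaultdict(float)
q_update(advisor_Q, "empty_board", 4, 1.0, "x_in_center", next_actions=[])

# Advisee warm-starts from the advisor's table (the transfer step),
# then continues learning on its own transitions.
advisee_Q = defaultdict(float, advisor_Q)
```

Warm-starting this way speeds up early learning exactly because the advisee's initial value estimates are already informed by the advisor's experience.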


People use fast, goal-directed simulation to reason about novel games

Zhang, Cedegao E., Collins, Katherine M., Wong, Lionel, Weller, Adrian, Tenenbaum, Joshua B.

arXiv.org Artificial Intelligence

We can evaluate features of problems and their potential solutions well before we can effectively solve them. When considering a game we have never played, for instance, we might infer whether it is likely to be challenging, fair, or fun simply from hearing the game rules, prior to deciding whether to invest time in learning the game or trying to play it well. Many studies of game play have focused on optimality and expertise, characterizing how people and computational models play based on moderate to extensive search and after playing a game dozens (if not thousands or millions) of times. Here, we study how people reason about a range of simple but novel connect-n style board games. We ask people to judge how fair and how fun the games are from very little experience: just thinking about the game for a minute or so, before they have ever actually played with anyone else, and we propose a resource-limited model that captures their judgments using only a small number of partial game simulations and almost no lookahead search.
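The resource-limited idea, judging a game from a handful of cheap simulations rather than deep search, can be sketched with random playouts of a gravity-based connect-n game (a minimal illustration, not the authors' model): the gap between the two players' win counts under random play is a crude proxy for fairness.

```python
import random

def winner(board, rows, cols, n):
    """Return the player (0 or 1) holding an n-in-a-row run, or None."""
    dirs = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(rows):
        for c in range(cols):
            p = board[r][c]
            if p is None:
                continue
            for dr, dc in dirs:
                if all(0 <= r + dr * k < rows and 0 <= c + dc * k < cols
                       and board[r + dr * k][c + dc * k] == p
                       for k in range(n)):
                    return p
    return None

def random_playout(rows, cols, n, rng):
    """One random playout of a gravity-based connect-n game."""
    board = [[None] * cols for _ in range(rows)]
    heights = [0] * cols  # next free row in each column
    player = 0
    for _ in range(rows * cols):
        open_cols = [c for c in range(cols) if heights[c] < rows]
        c = rng.choice(open_cols)
        board[heights[c]][c] = player
        heights[c] += 1
        if winner(board, rows, cols, n) is not None:
            return player
        player = 1 - player
    return None  # draw

def win_counts(rows, cols, n, sims=500, seed=0):
    """Crude fairness probe: per-player win counts over random playouts."""
    rng = random.Random(seed)
    wins = [0, 0]
    for _ in range(sims):
        w = random_playout(rows, cols, n, rng)
        if w is not None:
            wins[w] += 1
    return wins

wins = win_counts(3, 3, 3)
```

A large gap between `wins[0]` and `wins[1]` suggests an unfair game; the paper's model is far more sophisticated (partial simulations with limited lookahead), but this captures the spirit of evaluating a game without mastering it.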


clembench-2024: A Challenging, Dynamic, Complementary, Multilingual Benchmark and Underlying Flexible Framework for LLMs as Multi-Action Agents

Beyer, Anne, Chalamalasetti, Kranti, Hakimov, Sherzod, Madureira, Brielen, Sadler, Philipp, Schlangen, David

arXiv.org Artificial Intelligence

It has been established in recent work that Large Language Models (LLMs) can be prompted to "self-play" conversational games that probe certain capabilities (general instruction following, strategic goal orientation, language understanding abilities), where the resulting interactive game play can be automatically scored. In this paper, we take one of the proposed frameworks for setting up such game-play environments, and further test its usefulness as an evaluation instrument, along a number of dimensions: We show that it can easily keep up with new developments while avoiding data contamination, we show that the tests implemented within it are not yet saturated (human performance is substantially higher than that of even the best models), and we show that it lends itself to investigating additional questions, such as the impact of the prompting language on performance. We believe that the approach forms a good basis for making decisions on model choice for building applied interactive systems, and perhaps ultimately setting up a closed-loop development environment of system and simulated evaluator.


Prompting Fairness: Artificial Intelligence as Game Players

Henry, Jazmia

arXiv.org Artificial Intelligence

Utilitarian games such as dictator games to measure fairness have been studied in the social sciences for decades. These games have given us insight not only into how humans view fairness but also into the conditions under which the frequency of fairness, altruism and greed increases or decreases. While these games have traditionally been focused on humans, the rise of AI gives us the ability to study how these models play these games. AI is becoming a constant in human interaction, and examining how these models portray fairness in game play can give us some insight into how AI makes decisions. Over 101 rounds of the dictator game, I conclude that AI has a strong sense of fairness that depends on whether it deems the person it is playing with trustworthy, that framing has a strong effect on how much AI gives a recipient when designated the trustee, and that there may be evidence that AI experiences inequality aversion just as humans do.


Clembench: Using Game Play to Evaluate Chat-Optimized Language Models as Conversational Agents

Chalamalasetti, Kranti, Götze, Jana, Hakimov, Sherzod, Madureira, Brielen, Sadler, Philipp, Schlangen, David

arXiv.org Artificial Intelligence

Recent work has proposed a methodology for the systematic evaluation of "Situated Language Understanding Agents" - agents that operate in rich linguistic and non-linguistic contexts - through testing them in carefully constructed interactive settings. Other recent work has argued that Large Language Models (LLMs), if suitably set up, can be understood as (simulators of) such agents. A connection suggests itself, which this paper explores: can LLMs be evaluated meaningfully by exposing them to constrained game-like settings that are built to challenge specific capabilities? As a proof of concept, this paper investigates five interaction settings, showing that current chat-optimised LLMs are, to an extent, capable of following game-play instructions. Both this capability and the quality of the game play, measured by how well the objectives of the different games are met, follow the development cycle, with newer models performing better. The metrics even for the comparatively simple example games are far from saturated, suggesting that the proposed instrument will retain its diagnostic value. Our general framework for implementing and evaluating games with LLMs is available at https://github.com/clembench .


Esports Data-to-commentary Generation on Large-scale Data-to-text Dataset

Wang, Zihan, Yoshinaga, Naoki

arXiv.org Artificial Intelligence

Esports, a sports competition using video games, has become one of the most important sporting events in recent years. Although more esports data is available than ever, only a small fraction of that data is accompanied by text commentaries that help the audience retrieve and understand the plays. Therefore, in this study, we introduce the task of generating game commentaries from structured data records to address this problem. We first build a large-scale esports data-to-text dataset using structured data and commentaries from a popular esports game, League of Legends. On this dataset, we devise several data preprocessing methods, including linearization and data splitting, to improve its quality. We then introduce several baseline encoder-decoder models and propose a hierarchical model to generate game commentaries. Considering the characteristics of esports commentaries, we design evaluation metrics covering three aspects of the output: correctness, fluency, and strategic depth. Experimental results on our large-scale esports dataset confirmed the advantage of the hierarchical model, and the results revealed several challenges of this novel task.


An adaptive music generation architecture for games based on the deep learning Transformer model

Santos, Gustavo Amaral Costa dos, Baffa, Augusto, Briot, Jean-Pierre, Feijó, Bruno, Furtado, Antonio Luz

arXiv.org Artificial Intelligence

This paper presents an architecture for generating music for video games based on the Transformer deep learning model. Our motivation is to be able to customize the generation according to the taste of the player, who can select a corpus of training examples corresponding to his preferred musical style. The system generates various musical layers, following the standard layering strategy currently used by composers designing video game music. To adapt the generated music to the game play and to the players' situation, we use an arousal-valence model of emotions to control the selection of musical layers. We discuss current limitations and prospects for the future, such as collaborative and interactive control of the musical components.
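The layer-selection mechanism the abstract describes can be sketched as a simple mapping from an (arousal, valence) state to a set of active layers (layer names and thresholds here are hypothetical, purely to illustrate the control scheme, not the paper's implementation):

```python
def select_layers(arousal, valence):
    """Map an (arousal, valence) state in [-1, 1]^2 to a set of musical
    layers. Layer names and thresholds are illustrative assumptions."""
    layers = ["base_pad"]           # always-on background layer
    if arousal > 0.3:
        layers.append("percussion")  # moderate tension adds rhythm
    if arousal > 0.7:
        layers.append("ostinato")    # high tension adds a driving figure
    # Valence picks the melodic mode: positive -> major, negative -> minor.
    layers.append("major_melody" if valence >= 0 else "minor_melody")
    return layers

calm = select_layers(-0.5, -0.5)
intense = select_layers(0.8, 0.5)
```

As the game state changes (more enemies, a victory), the engine would re-evaluate arousal and valence each tick and cross-fade layers in and out accordingly.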