Divide and conquer: How Microsoft researchers used AI to master Ms. Pac-Man - Next at Microsoft


Microsoft researchers have created an artificial intelligence-based system that learned how to get the maximum score on the addictive 1980s video game Ms. Pac-Man, using a divide-and-conquer method that could have broad implications for teaching AI agents to do complex tasks that augment human capabilities.

The team from Maluuba, a Canadian deep learning startup acquired by Microsoft earlier this year, used a branch of AI called reinforcement learning to play the Atari 2600 version of Ms. Pac-Man perfectly. Using that method, the team achieved the maximum possible score of 999,990.

Doina Precup, an associate professor of computer science at McGill University in Montreal, said that's a significant achievement among AI researchers, who have been using various video games to test their systems but have found Ms. Pac-Man among the most difficult to crack. But Precup said she was impressed not just with what the researchers achieved but with how they achieved it.
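The article does not spell out the method's details, but the divide-and-conquer idea can be sketched roughly as follows: several sub-agents each score actions for a single narrow objective, and a top-level aggregator combines those scores to pick one action. The function names and toy Q-values below are illustrative assumptions, not Maluuba's actual system.

```python
# Hypothetical sketch of a divide-and-conquer agent: each sub-agent
# rates actions for one objective (e.g. "eat this pellet", "avoid that
# ghost"), and the aggregator picks the action with the best total.
ACTIONS = ["up", "down", "left", "right"]

def aggregate_action(sub_q_values):
    """Pick the action whose summed per-objective score is highest."""
    totals = {a: sum(q[a] for q in sub_q_values) for a in ACTIONS}
    return max(totals, key=totals.get)

# Toy example: one sub-agent chases a pellet to the left, another
# flees a ghost approaching from below.
pellet_q = {"up": 0.1, "down": 0.1, "left": 0.9, "right": 0.2}
ghost_q = {"up": 0.6, "down": -0.9, "left": 0.3, "right": 0.1}

best = aggregate_action([pellet_q, ghost_q])  # "left" wins: 0.9 + 0.3
```

Splitting the task this way lets each sub-agent learn a much simpler problem than "play Ms. Pac-Man", which is one plausible reading of why the approach scales to such a hard game.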

Recurrent Deterministic Policy Gradient Method for Bipedal Locomotion on Rough Terrain Challenge

arXiv.org Artificial Intelligence

This paper presents a deep learning framework capable of solving partially observable locomotion tasks based on our novel Recurrent Deterministic Policy Gradient (RDPG). Three major improvements are applied in our RDPG-based learning framework: asynchronized backup of interpolated temporal difference, initialization of the hidden state using past trajectory scanning, and injection of external experiences learned by other agents. The proposed learning framework was implemented to solve the BipedalWalker challenge in OpenAI's Gym simulation environment, where only partial state information is available. Our simulation study shows that the autonomous behaviors generated by the RDPG agent are highly adaptive to a variety of obstacles and enable the agent to traverse rugged terrain effectively.
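The core mechanism behind a recurrent policy can be sketched in miniature: the actor carries a hidden state across timesteps, so its action depends on the history of observations rather than only the current one, which is what makes it usable under partial observability. The weights below are fixed toy values, not trained parameters, and the structure is an assumption for illustration rather than the paper's architecture.

```python
import math

# Minimal sketch of a recurrent deterministic policy. In RDPG the
# weights would be trained with deterministic policy gradients backed
# up through the recurrence; here they are hand-set toy values.
def step(h, obs, w_h=0.5, w_x=1.0, w_a=2.0):
    """One recurrent step: update the hidden state, emit an action."""
    h_next = math.tanh(w_h * h + w_x * obs)  # recurrent state update
    action = math.tanh(w_a * h_next)         # deterministic policy output
    return h_next, action

def rollout(observations, h0=0.0):
    """Run the policy over a sequence of partial observations."""
    h, actions = h0, []
    for obs in observations:
        h, a = step(h, obs)
        actions.append(a)
    return actions
```

Note that the same observation sequence yields different actions for different initial hidden states, which is why the paper's hidden-state initialization by scanning a past trajectory matters: it warms up `h` before the policy has to act.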

Generative Adversarial Network Rooms in Generative Graph Grammar Dungeons for The Legend of Zelda

arXiv.org Artificial Intelligence

Generative Adversarial Networks (GANs) have demonstrated their ability to learn patterns in data and produce new exemplars similar to, but different from, their training set in several domains, including video games. However, GANs have a fixed output size, so creating levels of arbitrary size for a dungeon crawling game is difficult. GANs also have trouble encoding semantic requirements that make levels interesting and playable. This paper combines a GAN approach to generating individual rooms with a graph grammar approach to combining rooms into a dungeon. The GAN captures design principles of individual rooms, while the graph grammar organizes rooms into a global layout with a sequence of obstacles determined by a designer. Room data from The Legend of Zelda is used to train the GAN. This approach is validated by a user study, which shows that GAN dungeons are as enjoyable to play as levels from the original game and as levels generated with a graph grammar alone. However, rooms in GAN dungeons are considered more complex, and plain graph grammar dungeons are considered the least complex and challenging. Only the GAN approach creates an extensive supply of both layouts and rooms, with rooms spanning a spectrum from those seen in the training set to new creations that merge design principles from multiple rooms.

Video game developers use Procedural Content Generation (PCG [1]) to increase replayability and reduce costs. Instead of experiencing the game once, players see new variations on every playthrough. This concept was introduced in Rogue (1980), which procedurally generates new dungeons on every play. PCG is also applied to modern games like Minecraft (2009), where users play on generated landscapes, and No Man's Sky (2016), where procedurally generated worlds contain procedurally generated animals. PCG encourages exploration and increases replayability.
An emerging PCG technique uses Generative Adversarial Networks (GANs [2]) to search the latent design space of video game levels, as has been done in Super Mario Bros. [3], Doom [4], an educational game [5], and the General Video Game AI (GVG-AI [6]) adaptation of The Legend of Zelda [7].
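The two-level design can be sketched with a toy grammar: rewrite rules fix the global dungeon layout (the sequence of obstacles a designer wants), and each terminal room in that layout would then be handed to a separate room generator (the GAN in the paper). The rules and room names below are illustrative assumptions, not the paper's actual grammar.

```python
# Toy graph-grammar sketch: nonterminals are rewritten by rules until
# only terminal room types remain, yielding a global dungeon layout.
RULES = {
    "Dungeon": ["Entrance", "Challenge", "Boss"],
    "Challenge": ["Key", "Lock"],
}

def expand(symbol):
    """Recursively rewrite nonterminals into a flat room sequence."""
    if symbol not in RULES:
        return [symbol]  # terminal: a concrete room type
    rooms = []
    for child in RULES[symbol]:
        rooms.extend(expand(child))
    return rooms

layout = expand("Dungeon")
# Each room in `layout` would then be filled in by a room generator,
# e.g. sampling a GAN's latent space for a Zelda-style tile layout.
```

This illustrates why the hybrid sidesteps the GAN's fixed output size: the grammar handles arbitrary-length layouts, while the fixed-size generator only ever produces single rooms.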

Affect-Based Early Prediction of Player Mental Demand and Engagement for Educational Games

AAAI Conferences

Player affect is a central consideration in the design of game-based learning environments. Affective indicators such as facial expressions exhibited during gameplay may support building more robust player models and adaptation modules. In game-based learning, predicting player mental demand and engagement from player affect is a particularly promising approach to helping create more effective gameplay. This paper reports on a predictive player-modeling approach that observes player affect during early interactions with a game-based learning environment and predicts self-reports of mental demand and engagement at the conclusion of gameplay sessions. The findings show that automatically detected facial expressions such as those associated with joy, disgust, sadness, and surprise are significant predictors of players' self-reported engagement and mental demand at the end of gameplay interactions. The results suggest that it is possible to create affect-based predictive player models that can enable proactively tailored gameplay by anticipating player mental demand and engagement.
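The prediction setup can be sketched abstractly: early-gameplay facial-expression intensities go in, and a probability of high end-of-session engagement comes out. The logistic form, feature names, and hand-set weights below are assumptions for illustration; the paper's actual model and learned coefficients may differ entirely.

```python
import math

# Illustrative sketch (not the paper's model): a hand-set logistic
# predictor mapping facial-expression intensities observed early in a
# session to a probability of high self-reported engagement.
WEIGHTS = {"joy": 1.5, "surprise": 0.8, "sadness": -1.0, "disgust": -1.2}
BIAS = 0.0

def predict_engagement(expressions):
    """Probability the player self-reports high engagement at session end."""
    z = BIAS + sum(WEIGHTS[k] * expressions.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing to [0, 1]
```

Because the prediction uses only early interactions, a game could in principle adapt mid-session, which is the "proactively tailored gameplay" the abstract points to.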

Time to Fold, Humans: Poker-Playing AI Beats Pros at Texas Hold'em


It is no mystery why poker is such a popular pastime: the dynamic card game produces drama in spades as players are locked in a complicated tango of acting and reacting that becomes increasingly tense with each escalating bet. The same elements that make poker so entertaining have also created a complex problem for artificial intelligence (AI). A study published today in Science describes an AI system called DeepStack that recently defeated professional human players in heads-up, no-limit Texas hold'em poker, an achievement that represents a leap forward in the types of problems AI systems can solve. DeepStack, developed by researchers at the University of Alberta, relies on the use of artificial neural networks that researchers trained ahead of time to develop poker intuition. During play, DeepStack uses its poker smarts to break down a complicated game into smaller, more manageable pieces that it can then work through on the fly.
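The "break the game into manageable pieces" idea can be sketched in its simplest form: search only a few moves deep, then substitute a learned value estimate (the "intuition") at the frontier instead of playing the game to its end. This is a hedged sketch of that general pattern, not DeepStack's actual algorithm (which handles imperfect information with continual re-solving); the toy game tree and values are assumptions.

```python
# Depth-limited search with a value estimate at the frontier: the
# estimate stands in for everything beyond the search horizon.
def solve(node, depth, value_estimate, children):
    """Depth-limited negamax that consults value_estimate at the frontier."""
    kids = children(node)
    if depth == 0 or not kids:
        return value_estimate(node)  # "intuition" replaces deeper search
    # Best move, assuming the opponent then does the same from their side.
    return max(-solve(kid, depth - 1, value_estimate, children) for kid in kids)

# Toy one-move game: values score positions for the player to move there.
TREE = {"root": ["a", "b"], "a": [], "b": []}
VALUES = {"a": -1.0, "b": 2.0}

best = solve("root", 2, VALUES.get, TREE.get)  # prefers "a": -(-1.0) = 1.0
```

Substituting a trained network for the toy `VALUES` table is what lets this pattern scale to games far too large to search exhaustively.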