Microsoft researchers have created an artificial intelligence-based system that learned how to achieve the maximum score on the addictive 1980s video game Ms. Pac-Man, using a divide-and-conquer method that could have broad implications for teaching AI agents to perform complex tasks that augment human capabilities. The team from Maluuba, a Canadian deep learning startup acquired by Microsoft earlier this year, used a branch of AI called reinforcement learning to play the Atari 2600 version of Ms. Pac-Man perfectly, achieving the maximum possible score of 999,990. Doina Precup, an associate professor of computer science at McGill University in Montreal, said that is a significant achievement among AI researchers, who have used various video games to test their systems but have found Ms. Pac-Man among the most difficult to crack. Precup added that she was impressed not just with what the researchers achieved, but with how they achieved it.
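A minimal sketch of the divide-and-conquer idea described above, in the spirit of reward decomposition: the article gives no implementation details, so the tabular Q-learning sub-agents, the per-component reward channels, and the summing aggregator below are all illustrative assumptions, not Maluuba's actual system.

    import numpy as np

    class SubAgent:
        """Learns a Q-function for a single reward component,
        e.g. one pellet or one ghost in Ms. Pac-Man (assumption)."""
        def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.99):
            self.q = np.zeros((n_states, n_actions))
            self.alpha, self.gamma = alpha, gamma

        def update(self, s, a, r, s_next):
            # One-step Q-learning on this component's reward channel only.
            td_target = r + self.gamma * self.q[s_next].max()
            self.q[s, a] += self.alpha * (td_target - self.q[s, a])

    def aggregate_action(agents, s):
        # Aggregator: sum the component Q-values and act greedily on the total.
        total_q = sum(agent.q[s] for agent in agents)
        return int(np.argmax(total_q))

The appeal of the decomposition is that each sub-problem has a far simpler reward signal than the full game, which turns an otherwise intractable credit-assignment problem into many learnable ones.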
Min, Wookhee (North Carolina State University) | Baikadi, Alok (University of Pittsburgh) | Mott, Bradford (North Carolina State University) | Rowe, Jonathan (North Carolina State University) | Liu, Barry (North Carolina State University) | Ha, Eun Young (IBM) | Lester, James (North Carolina State University)
Recent years have seen growing interest in player modeling, which supports the creation of player-adaptive digital games. A central problem in player modeling is goal recognition, which aims to recognize players' intentions from observable gameplay behaviors. Player goal recognition offers the promise of enabling games to dynamically adjust challenge levels, perform procedural content generation, and create believable NPC interactions. A growing body of work investigates a wide range of machine learning-based goal recognition models. In this paper, we introduce GOALIE, a multidimensional framework for evaluating player goal recognition models. The framework integrates multiple metrics for player goal recognition models, including two novel metrics: n-early convergence rate and standardized convergence point. We demonstrate the application of the GOALIE framework with the evaluation of several player goal recognition models, including Markov logic network-based, deep feedforward neural network-based, and long short-term memory network-based goal recognizers, on two different educational games. The results suggest that GOALIE effectively captures goal recognition behaviors that are key to next-generation player modeling.
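To make the two novel metrics concrete, here is a small sketch of how they might be computed from per-step goal predictions. The exact definitions are in the paper; the conventions below (convergence means predicting the true goal from some step onward, points normalized by sequence length, non-converging sequences scored as the worst case) are stated assumptions.

    def convergence_index(preds, true_goal):
        """Earliest step from which every later prediction equals the
        true goal; None if the sequence never converges (assumed definition)."""
        idx = None
        for i, p in enumerate(preds):
            if p == true_goal:
                if idx is None:
                    idx = i
            else:
                idx = None
        return idx

    def n_early_convergence_rate(sequences, n):
        """Fraction of (predictions, true_goal) pairs that converge at least
        n observations before the sequence ends (assumed definition)."""
        hits = 0
        for preds, goal in sequences:
            idx = convergence_index(preds, goal)
            if idx is not None and len(preds) - idx >= n:
                hits += 1
        return hits / len(sequences)

    def standardized_convergence_point(sequences):
        """Mean convergence index normalized by sequence length; lower is
        better, non-converging sequences count as 1.0 (assumed definition)."""
        points = []
        for preds, goal in sequences:
            idx = convergence_index(preds, goal)
            points.append((idx + 1) / len(preds) if idx is not None else 1.0)
        return sum(points) / len(points)

Under these conventions, a recognizer that locks onto the correct goal early in a sequence scores a low standardized convergence point and counts toward the n-early rate for larger n.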
This paper presents a deep learning framework capable of solving partially observable locomotion tasks based on our novel Recurrent Deterministic Policy Gradient (RDPG). Three major improvements are applied in our RDPG-based learning framework: asynchronized backup of interpolated temporal differences, initialization of the hidden state by scanning past trajectories, and injection of external experiences learned by other agents. The proposed learning framework was implemented to solve the BipedalWalker challenge in OpenAI's Gym simulation environment, where only partial state information is available. Our simulation study shows that the autonomous behaviors generated by the RDPG agent are highly adaptive to a variety of obstacles and enable the agent to traverse rugged terrain effectively.
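As a concrete illustration of the recurrent-policy idea (not the authors' exact architecture; the layer sizes, single LSTM layer, and tanh action head are assumptions), a deterministic actor for partially observable control can be sketched in PyTorch as follows:

    import torch
    import torch.nn as nn

    class RecurrentActor(nn.Module):
        """Deterministic policy for POMDPs: an LSTM summarizes the observation
        history and a linear head maps its hidden state to continuous actions."""
        def __init__(self, obs_dim, act_dim, hidden_dim=128):
            super().__init__()
            self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
            self.head = nn.Sequential(nn.Linear(hidden_dim, act_dim), nn.Tanh())

        def forward(self, obs_seq, hidden=None):
            # obs_seq: (batch, time, obs_dim). The recurrent state carries the
            # agent's memory, which compensates for the missing state information.
            out, hidden = self.lstm(obs_seq, hidden)
            return self.head(out), hidden

    # Acting online: feed one observation at a time and thread the hidden state.
    actor = RecurrentActor(obs_dim=24, act_dim=4)  # BipedalWalker-sized (assumption)
    obs = torch.zeros(1, 1, 24)
    action, h = actor(obs)      # first step: hidden state defaults to zeros
    action, h = actor(obs, h)   # subsequent steps reuse the recurrent state

The hidden-state initialization improvement mentioned above would replace the zero default on the first step with a state produced by scanning a past trajectory.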
Understanding player behavior is fundamental in game data science. Video games evolve as players interact with them, so being able to foresee player experience would help ensure successful game development. In particular, game developers need to evaluate beforehand the impact of in-game events, and simulation optimization of these events is crucial to increasing player engagement and maximizing monetization. We present an experimental analysis of several methods for forecasting game-related variables, with two main aims: to obtain accurate predictions of in-app purchases and playtime in an operational production environment, and to perform simulations of in-game events in order to maximize sales and playtime. Our ultimate purpose is to take a step toward the data-driven development of games. The results suggest that, even though traditional approaches such as ARIMA still perform better, the outcomes of state-of-the-art techniques like deep learning are promising. Deep learning emerges as a well-suited general model that could be used to forecast a variety of time series with different dynamic behaviors.
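As a sketch of the kind of baseline this comparison involves, an ARIMA forecast of a playtime series can be produced with statsmodels. The synthetic series, the (1, 1, 1) order, and the 14-day horizon below are illustrative assumptions, not the paper's setup.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # Illustrative daily playtime series; a real pipeline would load
    # production telemetry instead of simulating a random walk.
    rng = np.random.default_rng(0)
    playtime = 100 + np.cumsum(rng.normal(0, 2, size=120))

    # Fit a simple ARIMA(1,1,1) baseline and forecast the next 14 days.
    fitted = ARIMA(playtime, order=(1, 1, 1)).fit()
    forecast = fitted.forecast(steps=14)

A deep learning alternative would replace the per-series fit with, for example, a recurrent network trained across many such series, which is what makes it attractive as a single general model for time series with different dynamic behaviors.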
Since its invention by a Hungarian architect in 1974, the Rubik's Cube has furrowed the brows of many who have tried to solve it, but the 3-D logic puzzle is no match for an artificial intelligence system created by researchers at the University of California, Irvine. DeepCubeA, a deep reinforcement learning algorithm programmed by UCI computer scientists and mathematicians, can find the solution in a fraction of a second, without any domain-specific knowledge or in-game coaching from humans. This is no simple task, considering that the cube has completion paths numbering in the billions but only one goal state, in which each of the six sides displays a solid color, a configuration that apparently can't be found through random moves. For a study published today in Nature Machine Intelligence, the researchers demonstrated that DeepCubeA solved 100 percent of all test configurations, finding the shortest path to the goal state about 60 percent of the time. The algorithm also works on other combinatorial games, such as the sliding tile puzzle, Lights Out, and Sokoban.
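The general pattern DeepCubeA exemplifies, plugging a learned cost-to-go estimate into a weighted best-first search, can be sketched as follows. The puzzle interface, the unit move cost, and the weighting scheme below are assumptions; the published algorithm's exact formulation may differ.

    import heapq
    import itertools

    def weighted_a_star(start, is_goal, neighbors, heuristic, weight=0.6):
        """Weighted A* with a pluggable heuristic: f = g + weight * h.
        DeepCubeA trains a neural network to play the role of `heuristic`."""
        tie = itertools.count()  # tie-breaker so states are never compared
        frontier = [(weight * heuristic(start), 0, next(tie), start, [])]
        seen = set()
        while frontier:
            _, g, _, state, path = heapq.heappop(frontier)
            if is_goal(state):
                return path  # the sequence of moves to the solved state
            if state in seen:
                continue
            seen.add(state)
            for move, nxt in neighbors(state):  # assumed puzzle interface
                if nxt not in seen:
                    f = g + 1 + weight * heuristic(nxt)
                    heapq.heappush(frontier, (f, g + 1, next(tie), nxt, path + [move]))
        return None

With a well-trained heuristic the search expands very few states, which is how near-shortest solutions can be found in a fraction of a second despite the enormous state space.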