
3D-printed robotic hand powered by water can play Super Mario Bros

New Scientist

A 3D-printed robotic hand controlled by pressurised water can complete the first level of classic computer game Super Mario Bros in less than 90 seconds. Ryan Sochol and his team at the University of Maryland were able to 3D print the hand in a single operation using a machine that can deposit hard plastic, a rubber-like polymer and a water-soluble "sacrificial" material.


Using Reinforcement Learning to play Super Mario Bros on NES using TensorFlow

#artificialintelligence

Reinforcement learning is currently one of the hottest topics in machine learning. For a recent conference we attended (the awesome Data Festival in Munich), we developed a reinforcement learning model that learns to play Super Mario Bros on NES, so that visitors to our booth could compete against the agent on level completion time. The promotion was a great success and people enjoyed the "human vs. machine" competition. Only one contestant was able to beat the AI, by taking a secret shortcut that the AI wasn't aware of. Also, developing the model in Python was a lot of fun, so I decided to write a blog post about it that covers some of the fundamental concepts of reinforcement learning as well as the actual implementation of our Super Mario agent in TensorFlow (beware: I used TensorFlow 1.13.1, as TensorFlow 2.0 had not been released at the time of writing).

Most machine learning models have an explicit connection between inputs and outputs that does not change during training. It can therefore be difficult to model or predict systems where the inputs or targets themselves depend on previous predictions. Often, however, the world around the model updates itself with every prediction made. What sounds quite abstract is actually a very common situation in the real world: autonomous driving, machine control, process automation, and so on. In many situations, decisions made by models have an impact on their surroundings and consequently on the next actions to be taken. Classical supervised learning approaches can only be used to a limited extent in such situations. To handle them, we need machine learning models that can cope with time-dependent, interdependent inputs and outputs. This is where reinforcement learning comes into play.
In reinforcement learning, the model (called the agent) interacts with its environment by choosing from a set of possible actions (the action space) in each state of the environment; these actions produce positive or negative rewards from the environment. Think of a reward as an abstract signal that the action taken was good or bad. The reward issued by the environment can be immediate or delayed into the future. By learning from the combination of environment states, actions, and corresponding rewards (so-called transitions), the agent tries to reach an optimal set of decision rules (the policy) that maximizes the total reward it gathers. In reinforcement learning we often use a learning approach called Q-learning. Q-learning is based on so-called Q-values, which help the agent determine the optimal action given the current state of the environment. Q-values are "discounted" future rewards that our agent collects during training by taking actions and moving through the different states of the environment.
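The state/action/reward loop and the Q-value update described above can be sketched in a few lines. This is a minimal tabular Q-learning illustration on a hypothetical toy environment (a row of five states where moving right toward the last state earns a reward), not the article's actual agent, which uses a neural network in TensorFlow to approximate Q-values for Super Mario frames:

```python
import numpy as np

# Hypothetical toy environment: 5 states in a row, two actions
# (0 = left, 1 = right); reaching the last state ends the episode
# with reward +1. Purely illustrative, not the Mario environment.
n_states, n_actions = 5, 2
alpha, gamma, epsilon = 0.1, 0.9, 0.1

Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(state, action):
    """Move left/right; reward 1 only when the goal state is reached."""
    next_state = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

for episode in range(500):
    state = 0
    for t in range(100):  # cap episode length
        # Epsilon-greedy action selection (ties broken randomly)
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(rng.choice(np.flatnonzero(Q[state] == Q[state].max())))
        next_state, reward, done = step(state, action)
        # Q-learning update: move Q(s, a) toward the discounted
        # future reward  r + gamma * max_a' Q(s', a')
        target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state
        if done:
            break

# After training, the greedy policy moves right in every non-terminal state.
policy = np.argmax(Q, axis=1)
```

The `target - Q[state, action]` term is the temporal-difference error; the deep Q-network variant used for the Mario agent minimizes the same error, but with a network predicting Q-values instead of a lookup table.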


This live-stream of AI learning to play Super Mario Bros is awesome

#artificialintelligence

Einfach nerdig, a YouTuber with currently only one video up, started a livestream of an AI learning to play "Super Mario Bros." 4 days ago. It's still running, and watching it is amazing. The AI, MarI/O, comes courtesy of creator SethBling, who, despite his own huge following, isn't the one streaming the training session. The account streaming the video has disabled embedding, but you can watch it learn to play the game here on YouTube. SethBling, a world-record holder for "Super Mario World" speedruns, previously trained the MarI/O AI to play "Super Mario World" by feeding it footage of his own gameplay.


Clever Machines Learn How to Be Curious (And Play Super Mario Bros.)

WIRED

You probably can't remember what it feels like to play Super Mario Bros. for the very first time, but try to picture it. An 8-bit game world blinks into being: baby blue sky, tessellated stone ground, and in between, a squat, red-suited man standing still--waiting. He's facing rightward; you nudge him farther in that direction. A few more steps reveal a row of bricks hovering overhead and what looks like an angry, ambulatory mushroom. Another twitch of the game controls makes the man spring up, his four-pixel fist pointed skyward. Maybe try combining nudge-rightward and spring-skyward?


Teaching Your Computer To Play Super Mario Bros. – A Fork of the Google DeepMind Atari Machine Learning Project

#artificialintelligence

The second issue I noticed was that there seemed to be little connection between the network's confidence in its actions and its actual score. I came across another recent paper on something called Double Q Learning, also courtesy of DeepMind, which substantially improved Google's original results. Double Q Learning counters the tendency for Q networks to become overconfident in their predictions. I changed Google's original Deep Q Network to a Double Deep Q Network, and that helped substantially. Finally, the biggest improvement of all came when I was just more patient. Even running on a powerful machine with an Nvidia 980 GPU, the emulator could only go so fast. As a consequence, one million training steps took about a day, with quite a bit of variance in the scores along the way.
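The overconfidence that Double Q Learning counters comes from using the same network both to select the best next action and to evaluate it. A minimal sketch of the difference in the two targets, assuming hypothetical arrays of per-action Q-values from an online and a target network (names are illustrative, not from the project's code):

```python
import numpy as np

gamma = 0.99  # discount factor

def dqn_target(reward, next_q_online, next_q_target, done):
    # Standard DQN: the target network both selects and evaluates
    # the next action, which tends to overestimate its value.
    return reward + (0.0 if done else gamma * float(np.max(next_q_target)))

def double_dqn_target(reward, next_q_online, next_q_target, done):
    # Double DQN: the online network selects the action, but the
    # target network evaluates it, countering overconfidence.
    best_action = int(np.argmax(next_q_online))
    return reward + (0.0 if done else gamma * float(next_q_target[best_action]))
```

If the target network happens to overvalue some action (say `next_q_target = [5.0, 0.5]` while the online network prefers action 1), the standard target takes the inflated 5.0, whereas the Double DQN target evaluates the online network's chosen action at 0.5.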