Reinforcement Learning


Deep Q Learning is Simple with Keras Tutorial

#artificialintelligence

In this tutorial you'll code up a simple Deep Q Network in Keras to beat the Lunar Lander environment from the OpenAI Gym. It's only 150 lines of code, and Keras makes it incredibly simple to do.
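The tutorial's own code is not reproduced in this listing. As a rough illustration of what a Deep Q Network is trained toward, here is a minimal sketch of the standard Q-learning target and an epsilon-greedy action choice, written in plain Python. The function names `td_targets` and `epsilon_greedy` and the default `gamma` are assumptions made for this example, not taken from the tutorial.

```python
import random


def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute r + gamma * max_a' Q(s', a') for each transition.

    Terminal transitions (done=True) do not bootstrap from the next state.
    All names here are hypothetical, chosen for illustration only.
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Two transitions: one mid-episode, one that ends the episode.
example_targets = td_targets(
    rewards=[1.0, -100.0],
    next_q_values=[[0.5, 2.0], [3.0, 1.0]],
    dones=[False, True],
    gamma=0.5,
)
greedy_action = epsilon_greedy([0.1, 0.9, 0.3, 0.2], epsilon=0.0, rng=random.Random(0))
```

In a full agent, the network's prediction for the action actually taken would be regressed toward these targets, typically on minibatches sampled from a replay buffer.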


OpenAI releases Safety Gym for reinforcement learning

#artificialintelligence

While much work in data science to date has focused on algorithmic scale and sophistication, safety -- that is, safeguards against harm -- is a domain no less worth pursuing. This is particularly true in applications like self-driving vehicles, where a machine learning system's poor judgement might contribute to an accident. That's why firms like Intel's Mobileye and Nvidia have proposed frameworks to guarantee safe and logical decision-making, and it's why OpenAI -- the San Francisco-based research firm cofounded by CTO Greg Brockman, chief scientist Ilya Sutskever, and others -- today released Safety Gym. OpenAI describes it as a suite of tools for developing AI that respects safety constraints while training, and for comparing the "safety" of algorithms and the extent to which those algorithms avoid mistakes while learning. Safety Gym is designed for reinforcement learning agents, or AI that's progressively spurred toward goals via rewards (or punishments).


The Reinforcement-Learning Methods that Allow AlphaStar to Outcompete Almost All Human Players at StarCraft II - KDnuggets

#artificialintelligence

In January, artificial intelligence (AI) powerhouse DeepMind announced it had achieved a major milestone in its journey towards building AI systems that resemble human cognition. AlphaStar was a DeepMind agent designed using reinforcement learning that was able to beat two professional players at a game of StarCraft II, one of the most complex real-time strategy games of all time. Over the last few months, DeepMind has continued evolving AlphaStar to the point that the AI agent can now play a full game of StarCraft II at Grandmaster level, outranking 99.8% of human players. The results were recently published in Nature, and they show some of the most advanced self-learning techniques used in modern AI systems. DeepMind's milestone is best explained by illustrating the trajectory from the first version of AlphaStar to the current one, as well as some of the key challenges of StarCraft II.


DeepMind on Twitter

#artificialintelligence

Our new @nature paper: AlphaStar is the first learning system to reach the top tier of a major esport without any game restrictions, achieving Grandmaster status in StarCraft II. Researchers have been working on the StarCraft series for over 15 years.


Algorithms that learn to solve tasks by watching 1 Youtube video by Samiran & Shibsankar #ODSC_India

#artificialintelligence

Two branches of AI, Deep Learning and Reinforcement Learning, are now responsible for many real-world applications. Machine Translation, Speech Recognition, Object Detection, Robot Control, and Drug Discovery are some of the numerous examples. Both approaches are data hungry: DL requires many examples of each class, and RL needs to play through many episodes to learn a policy. A small child, by contrast, can typically see an image just once and instantly recognize it in other contexts and environments. We seem to possess an innate model/representation of how the world works, which helps us grasp new concepts and adapt to new situations fast.


What is Reinforcement Learning? AI 101

#artificialintelligence



New robotic arm at University of Alberta to help students better understand artificial intelligence

#artificialintelligence

Students at the University of Alberta are getting hands-on experience with artificial intelligence with a new robotic arm. Donated to the university's department of computing science by Kindred AI, a Canada-based artificial intelligence company, the arm is used in the classroom to help students get a sense of reinforcement learning. Reinforcement learning is a branch of artificial intelligence, says Rupam Mahmood, assistant professor at the U of A and former Kindred AI research lead. "In reinforcement learning, we study by letting the agent interact with the environment, so that it can take the right set of actions," said Mahmood. Usually, the study is done through computer simulations and board games, but in real-world applications a robotic arm is used.


Beyond DQN/A3C: A Survey in Advanced Reinforcement Learning

#artificialintelligence

One of my favorite things about deep reinforcement learning is that, unlike supervised learning, it really, really doesn't want to work. Throwing a neural net at a computer vision problem might get you 80% of the way there. Throwing a neural net at an RL problem will probably blow something up in front of your face -- and it will blow up in a different way each time you try. A lot of the biggest challenges in RL revolve around two questions: how we interact with the environment effectively (e.g. In this post, I want to explore a few recent directions in deep RL research that attempt to address these challenges, and do so with particularly elegant parallels to human cognition.


A Reinforcement-Learning-Based Distributed Resource Selection Algorithm for Massive IoT

#artificialintelligence

Massive IoT, which involves a large number of resource-constrained IoT devices, has gained great attention. IoT devices generate enormous traffic, which causes network congestion. To manage network congestion, multi-channel-based algorithms have been proposed. However, most of the existing multi-channel algorithms require strict synchronization and extra overhead for negotiating channel assignment, which poses significant challenges to resource-constrained IoT devices. In this paper, a distributed channel selection algorithm utilizing the tug-of-war (TOW) dynamics is proposed to improve successful frame delivery across the whole network by letting IoT devices adaptively select suitable channels for communication.


Asynchronous Methods for Deep Reinforcement Learning

#artificialintelligence

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training, allowing all four methods to successfully train neural network controllers.
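To give a feel for what "asynchronous gradient descent" with parallel learners means, the following is a toy sketch only: several threads each compute noisy gradients of a simple quadratic loss and apply them to one shared parameter without locking. It is not the paper's implementation, which trains neural network controllers with reinforcement learning algorithms. The function `train_async`, the loss, and all hyperparameters are assumptions made for this example.

```python
import random
import threading


def train_async(num_workers=4, steps_per_worker=200, lr=0.05, optimum=3.0, seed=0):
    """Minimise (w - optimum)^2 with several lock-free worker threads.

    Hypothetical toy setup: each worker sees its own noisy view of the
    optimum, mimicking learners that collect different experience.
    """
    shared = {"w": 0.0}  # parameter shared by all workers

    def worker(worker_id):
        rng = random.Random(seed + worker_id)  # independent noise per worker
        for _ in range(steps_per_worker):
            w = shared["w"]  # read a possibly stale copy of the parameter
            noisy_optimum = optimum + rng.gauss(0.0, 0.1)
            grad = 2.0 * (w - noisy_optimum)  # gradient of the squared error
            shared["w"] -= lr * grad  # apply the update without a lock

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared["w"]


final_w = train_async()
```

Because every worker's update pulls the shared parameter toward the same optimum, occasional stale reads do not prevent convergence in this toy case; `final_w` should end up close to 3.0.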