Reinforcement Learning


How Your Brain (and a Computer) Learn the 'Rules of the Game'

#artificialintelligence

In 1848, the 25-year-old Phineas Gage was working on a railroad in Vermont, packing explosive powder into a hole with an iron tamping rod. Unexpectedly, the powder exploded, sending the rod backwards through Gage's skull and brain. That he survived is a miracle, but astonishingly he even seemed capable of functioning effectively, maintaining normal memory, speech, and motor skills. Those who knew him, however, thought he was anything but the same, with friends remarking he was "no longer Gage": "…his equilibrium, or balance, so to speak, between his intellectual faculties and animal propensities seems to have been destroyed."



[D] What are some non-trivial but achievable exercises in reinforcement learning? • r/MachineLearning

@machinelearnbot

It's a lot more involved, but the Easy21 assignment from David Silver's RL course is a very thorough introduction to RL, though it does require working through the prerequisite material in the course. This stands in contrast to putting a cross-entropy loss on a CNN and training it on MNIST; courses and practicals vary in how practically they treat this task versus how deeply they go into the underlying theory.


Data Science: Machine Learning algorithms in Matlab

@machinelearnbot

In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.



An introduction to Policy Gradients with Cartpole and Doom

#artificialintelligence

In the last two articles, about Q-learning and Deep Q-learning, we worked with value-based reinforcement learning algorithms. To choose which action to take in a given state, we take the action with the highest Q-value (the maximum expected future reward at that state). As a consequence, in value-based learning, a policy exists only implicitly, through these action-value estimates. Today, we'll learn a policy-based reinforcement learning technique called Policy Gradients, and train agents on Cartpole and Doom. The first will learn to keep the bar in balance.
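The distinction above can be sketched in a few lines. This is a minimal illustration, not code from the article: the Q-table and policy parameters are hypothetical toy arrays, just to show that a value-based agent acts by taking an argmax over Q-values, while a policy-based agent samples from an explicit, parameterized distribution over actions.

```python
import numpy as np

def greedy_action(Q, state):
    """Value-based control: the policy is implicit -- act greedily w.r.t. Q."""
    return int(np.argmax(Q[state]))

def policy_probs(theta, state):
    """Policy-based control: an explicit softmax distribution over actions."""
    prefs = theta[state]
    exp = np.exp(prefs - prefs.max())  # subtract max for numerical stability
    return exp / exp.sum()

# Hypothetical toy problem: 2 states x 2 actions.
Q = np.array([[0.1, 0.9],
              [0.5, 0.2]])           # action-value estimates
theta = np.array([[0.0, 2.0],
                  [1.0, 1.0]])       # policy parameters (action preferences)

print(greedy_action(Q, 0))           # -> 1 (the action with the higher Q-value)
print(policy_probs(theta, 1))        # -> [0.5 0.5] (equal preferences)
```

A policy-gradient method would then adjust `theta` in the direction that makes high-reward actions more probable, rather than ever computing Q-values explicitly.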


Hands-On Reinforcement Learning with Python (Udemy)

@machinelearnbot

Reinforcement learning (RL) is hot! It allows programmers to create software agents that learn to take optimal actions to maximize reward, through trying out different strategies in a given environment. This course will take you through all the core concepts in Reinforcement Learning, transforming a theoretical subject into tangible Python coding exercises with the help of OpenAI Gym. The videos will first guide you through the gym environment, solving the CartPole-v0 toy robotics problem, before moving on to coding up and solving a multi-armed bandit problem in Python. As the course ramps up, it shows you how to use dynamic programming and TensorFlow-based neural networks to solve GridWorld, another OpenAI Gym challenge.
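The multi-armed bandit problem mentioned above can be sketched without Gym at all. The following is a minimal epsilon-greedy agent under assumed Gaussian rewards (the arm means, step count, and epsilon are illustrative choices, not values from the course): with small probability it explores a random arm, otherwise it exploits its current best estimate, updating a running mean after each pull.

```python
import random

def run_bandit(true_means, steps=5000, eps=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit with Gaussian rewards."""
    rng = random.Random(seed)
    k = len(true_means)
    est = [0.0] * k      # running estimate of each arm's mean reward
    pulls = [0] * k      # how many times each arm was pulled
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(k)                      # explore: random arm
        else:
            a = max(range(k), key=lambda i: est[i])   # exploit: best estimate
        r = rng.gauss(true_means[a], 1.0)             # noisy reward
        pulls[a] += 1
        est[a] += (r - est[a]) / pulls[a]             # incremental mean update
    return est, pulls

est, pulls = run_bandit([0.2, 0.8, 0.5])
# The agent should concentrate most of its pulls on the best arm (index 1).
```

The same explore/exploit trade-off reappears in the CartPole and GridWorld exercises, just with states and value functions added on top.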


TensorFlow and deep reinforcement learning, without a PhD (Google I/O '18)

#artificialintelligence

At the forefront of deep learning research is a technique called reinforcement learning, which bridges the gap between academic deep learning problems and the ways learning occurs in nature, in weakly supervised environments. This technique is heavily used when researching areas like learning how to walk, chase prey, navigate complex environments, and even play Go. This session will teach a neural network to play the video game Pong from just the pixels on the screen. No rules, no strategy coaching, and no PhD required. See all the sessions from Google I/O '18 here: https://goo.gl/q1Tr8x


Prefrontal cortex as a meta-reinforcement learning system (DeepMind)

#artificialintelligence

In our new paper in Nature Neuroscience, we use the meta-reinforcement learning framework developed in AI research to investigate the role of dopamine in the brain in helping us to learn. Dopamine, commonly known as the brain's pleasure signal, has often been thought of as analogous to the reward prediction error signal used in AI reinforcement learning algorithms. These systems learn to act by trial and error, guided by reward. We propose that dopamine's role goes beyond just using reward to learn the value of past actions, and that it plays an integral role, specifically within the prefrontal cortex, in allowing us to learn efficiently, rapidly, and flexibly on new tasks. We tested our theory by virtually recreating six meta-learning experiments from the field of neuroscience, each requiring an agent to perform tasks that use the same underlying principles (or set of skills) but that vary in some dimension.
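The reward prediction error mentioned above is, in standard temporal-difference (TD) learning, the quantity delta = r + gamma*V(s') - V(s). The toy below is a generic TD sketch, not DeepMind's model: a "cue" state reliably precedes a rewarded state, and over repeated trials the prediction error propagates the reward's value back to the cue (the state names, learning rate, and discount are illustrative assumptions).

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9, terminal=False):
    """One temporal-difference update on V[s]; returns the prediction error."""
    target = r if terminal else r + gamma * V[s_next]
    delta = target - V[s]        # reward prediction error (dopamine analogue)
    V[s] += alpha * delta
    return delta

V = {"cue": 0.0, "reward_state": 0.0}
# Repeated trials: cue -> reward_state, with reward 1 delivered at the end.
for _ in range(200):
    td_update(V, "reward_state", r=1.0, s_next=None, terminal=True)
    td_update(V, "cue", r=0.0, s_next="reward_state")
# V["cue"] climbs toward gamma * 1.0 = 0.9: the cue itself comes to predict reward,
# mirroring how dopamine responses shift from the reward to the predictive cue.
```

The paper's claim is that dopamine does more than carry this scalar error, but the TD error is the baseline analogy it builds on.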


Data Science: Supervised Machine Learning in Python

@machinelearnbot

In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.