In this book, we focus on those reinforcement learning algorithms that build on the powerful theory of dynamic programming. We give a fairly comprehensive catalog of learning problems, describe the core ideas, and survey a large number of state-of-the-art algorithms, followed by a discussion of their theoretical properties and limitations. ISBN 9781608454921, 103 pages.
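As a rough illustration of the dynamic-programming foundation mentioned above, the following is a minimal sketch of value iteration, one standard dynamic-programming method. It is not taken from the book: the function name `value_iteration`, the `transitions` layout, and the two-state `mdp` are all hypothetical choices made for this example.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Return state values for a small, fully known MDP.

    transitions[s][a] is a list of (probability, next_state, reward) triples.
    This layout is an assumption made for this sketch.
    """
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:
                # Terminal state: no actions, so its value stays at 0.
                continue
            # Bellman optimality backup: best expected return over actions.
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Hypothetical two-state MDP: "go" ends the episode with reward 1,
# "stay" loops back to "start" with reward 0.
mdp = {
    "start": {"stay": [(1.0, "start", 0.0)], "go": [(1.0, "end", 1.0)]},
    "end": {},
}
values = value_iteration(mdp)
```

Here the sweep updates values in place, so each backup already uses the newest estimates from the same pass. Reinforcement learning methods of the kind the book describes replace the known `transitions` table with sampled experience.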
You will then learn how to implement these in Pythonic and concise PyTorch code that can be extended to include any future deep Q-learning algorithms. These algorithms will be used to solve a variety of environments from the OpenAI Gym's Atari library, including Pong, Breakout, and BankHeist. You will learn the key to making these deep Q-learning algorithms work, which is how to modify the OpenAI Gym's Atari library to meet the specifications of the original deep Q-learning papers. Also included is a mini course in deep learning using the PyTorch framework. This is geared toward students who are familiar with the basic concepts of deep learning but not the specifics, or those who are comfortable with deep learning in another framework, such as TensorFlow or Keras.
For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires. Furthermore, to provide the requisite level of generality, these skills must handle raw sensory input such as images. In this paper, we propose an algorithm that acquires such general-purpose skills by combining unsupervised representation learning and reinforcement learning of goal-conditioned policies. Since the particular goals that might be required at test time are not known in advance, the agent performs a self-supervised "practice" phase where it imagines goals and attempts to achieve them. We learn a visual representation with three distinct purposes: sampling goals for self-supervised practice, providing a structured transformation of raw sensory inputs, and computing a reward signal for goal reaching.
Reinforcement learning (RL) is an area of machine learning in which an agent learns by interacting with its environment to achieve a goal. In this course, you will be introduced to the world of reinforcement learning. You will learn how to frame reinforcement learning problems and start tackling classic examples like news recommendation, learning to navigate in a grid-world, and balancing a cart-pole. You will explore the basic algorithms, from multi-armed bandits, dynamic programming, and TD (temporal difference) learning, and progress towards larger state spaces using function approximation, in particular deep learning. You will also learn about algorithms that search directly for the best policy, namely policy gradient and actor-critic methods. Along the way, you will be introduced to Project Malmo, a platform for artificial intelligence experimentation and research built on top of the Minecraft game.
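To make the TD learning and grid-world ideas above concrete, here is a minimal sketch of tabular Q-learning, a temporal-difference method, on a one-dimensional corridor. It is not course material: the environment, the function name `q_learning_corridor`, the reward of 1 at the rightmost cell, and all hyperparameter values are assumptions chosen for illustration.

```python
import random


def q_learning_corridor(n_states=5, episodes=500, alpha=0.5,
                        gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values for a hypothetical corridor of n_states cells.

    The agent starts in cell 0 and the episode ends in the rightmost cell,
    which pays reward 1. Action 0 moves left, action 1 moves right.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: random action
            else:
                # Exploit: greedy action, breaking ties at random.
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            # Walls clamp the agent inside the corridor.
            s2 = min(max(s + moves[a], 0), goal)
            r = 1.0 if s2 == goal else 0.0
            # TD target: reward plus discounted value of the next state
            # (zero bootstrap at the terminal cell).
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


q_table = q_learning_corridor()
# Greedy policy per non-terminal cell: 1 means "move right".
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
```

With a table this small every state can be stored explicitly. The function approximation the course mentions replaces `q` with a parametric model when the state space is too large for that.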
So far we have assumed that all the data is given to us and that we don't really have much of a choice in what we do. For instance, when Google wants to display an ad for the query 'cat', it has thousands of possible ads at its disposal, ranging from kitty litter to catsuits. Which ad gets the most clicks depends very much on the ad, the user, and the context. Hence it must try out different versions to determine which ones are best. Doing this by brute force is very expensive (nobody wants to see too many catsuit ads).
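One simple alternative to brute-force trials is an epsilon-greedy strategy: mostly show the ad that currently looks best, and occasionally show a random one. The sketch below simulates this under loudly stated assumptions: the click probabilities are made up, the function name `epsilon_greedy_ads` is hypothetical, and user and context are ignored entirely.

```python
import random


def epsilon_greedy_ads(click_probs, rounds=10000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy ad selection.

    click_probs holds hypothetical true click rates, one per ad. They are
    unknown to the strategy and used only to simulate user clicks.
    """
    rng = random.Random(seed)
    n_ads = len(click_probs)
    shows = [0] * n_ads   # how often each ad was displayed
    clicks = [0] * n_ads  # how often each ad was clicked

    def estimate(i):
        # Empirical click rate; ads never shown get an optimistic 1.0
        # so that each one is tried at least once.
        return clicks[i] / shows[i] if shows[i] else 1.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            ad = rng.randrange(n_ads)             # explore
        else:
            ad = max(range(n_ads), key=estimate)  # exploit
        shows[ad] += 1
        if rng.random() < click_probs[ad]:
            clicks[ad] += 1
    return shows, clicks


# Three made-up ads; the third has the highest click rate.
shows, clicks = epsilon_greedy_ads([0.02, 0.05, 0.30])
```

Only about a fraction `epsilon` of the impressions is spent on exploration, so poorly performing ads are shown far less often than under uniform trial and error.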