Collaborating Authors

Schooling Flappy Bird: A Reinforcement Learning Tutorial


In classical programming, instructions are written explicitly by programmers and nothing is learned from data. In contrast, machine learning is a field of computer science that uses statistical methods to enable computers to learn and extract knowledge from data without being explicitly programmed. In this reinforcement learning tutorial, I'll show how we can use PyTorch to teach a reinforcement learning neural network how to play Flappy Bird. But first, we'll need to cover a number of building blocks. Machine learning algorithms can roughly be divided into two groups: traditional learning algorithms and deep learning algorithms. Traditional learning algorithms usually have far fewer learnable parameters than deep learning algorithms and much less learning capacity. Also, traditional learning algorithms cannot perform feature extraction themselves: artificial intelligence specialists need to work out a good data representation, which is then passed to the learning algorithm. Examples of traditional machine learning techniques include SVM, random forest, decision tree, and $k$-means, whereas the central algorithm in deep learning is the deep neural network.

Hyp-RL : Hyperparameter Optimization by Reinforcement Learning Machine Learning

Hyperparameter tuning is an omnipresent problem in machine learning, as it is integral to obtaining state-of-the-art performance for any model. Most often, hyperparameters are optimized simply by training a model on a grid of possible hyperparameter values and taking the one that performs best on a validation sample (grid search). More recently, methods have been introduced that build a so-called surrogate model, which predicts the validation loss for a specific hyperparameter setting, model and dataset, and then sequentially select the next hyperparameter to test based on a heuristic function of the expected value and the uncertainty of the surrogate model, called the acquisition function (sequential model-based Bayesian optimization, SMBO). In this paper we model the hyperparameter optimization problem as a sequential decision problem (which hyperparameter to test next) and address it with reinforcement learning. This way our model does not have to rely on a heuristic acquisition function like SMBO, but can learn which hyperparameters to test next based on the subsequent reduction in validation loss they will eventually lead to, either because they yield good models themselves or because they allow the hyperparameter selection policy to build a better surrogate model that is able to choose better hyperparameters later on. Experiments on a large battery of 50 data sets demonstrate that our method outperforms the state-of-the-art approaches for hyperparameter learning.
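
As a point of reference for the grid-search baseline described above, here is a minimal sketch in Python. It is not the paper's method: `validation_loss` is a hypothetical stand-in for "train a model and score it on a validation sample", and the hyperparameter names and grid values are assumptions chosen for illustration only.

```python
from itertools import product


def validation_loss(learning_rate, weight_decay):
    # Hypothetical stand-in for training a model and measuring validation loss.
    # A toy bowl-shaped function whose minimum lies at (0.1, 0.01).
    return (learning_rate - 0.1) ** 2 + (weight_decay - 0.01) ** 2


# Assumed grid of candidate values for each hyperparameter.
grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "weight_decay": [0.0, 0.01, 0.1],
}


def grid_search(grid, loss_fn):
    """Evaluate every combination on the grid and keep the best one."""
    names = list(grid)
    best_config, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        loss = loss_fn(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


best_config, best_loss = grid_search(grid, validation_loss)
print(best_config, best_loss)
```

Every configuration is evaluated independently of the others, which is exactly what sequential approaches such as SMBO, or the reinforcement learning formulation proposed here, try to improve on by using earlier results to decide what to test next.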

Deep Reinforcement Learning Artificial Intelligence

Deep reinforcement learning has gathered much attention recently. Impressive results were achieved in activities as diverse as autonomous driving, game playing, molecular recombination, and robotics. In all these fields, computer programs have taught themselves to solve difficult problems. They have learned to fly model helicopters and perform aerobatic manoeuvres such as loops and rolls. In some applications, such as Atari, Go, poker and StarCraft, they have even become better than the best humans. The way in which deep reinforcement learning explores complex environments reminds us of how children learn: by playfully trying out things, getting feedback, and trying again. The computer seems to truly possess aspects of human learning; this goes to the heart of the dream of artificial intelligence. The successes in research have not gone unnoticed by educators, and universities have started to offer courses on the subject. The aim of this book is to provide a comprehensive overview of the field of deep reinforcement learning. The book is written for graduate students of artificial intelligence, and for researchers and practitioners who wish to better understand deep reinforcement learning methods and their challenges. We assume an undergraduate-level understanding of computer science and artificial intelligence; the programming language of this book is Python. We describe the foundations, the algorithms and the applications of deep reinforcement learning. We cover the established model-free and model-based methods that form the basis of the field. The field develops quickly, so we also cover advanced topics: deep multi-agent reinforcement learning, deep hierarchical reinforcement learning, and deep meta learning.
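
The loop of trying things out, getting feedback, and trying again can be illustrated with a very small sketch. The example below is not deep reinforcement learning and is not taken from the book: it is a tabular epsilon-greedy agent on a multi-armed bandit, with reward probabilities, step count, and exploration rate all assumed for illustration.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-known arm, sometimes explore."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)    # how often each arm was tried
    values = [0.0] * len(reward_probs)  # running average reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm.
            arm = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the arm with the highest estimated reward.
            arm = max(range(len(values)), key=values.__getitem__)
        # Feedback from the (simulated) environment: reward of 1 or 0.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        # Incremental update of the running average for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts


values, counts = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(values)), key=values.__getitem__)
print(best_arm, values, counts)
```

Deep reinforcement learning replaces the small table of value estimates with a neural network, so that the same trial-and-feedback idea can scale to environments with far too many states to enumerate.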

Machine Learning Basics


Before we start this article on machine learning basics, let us take an example to understand the impact of machine learning in the world. We can safely say that machine learning has become a dominant force in today's world and has accelerated progress in many fields. No matter which industry you look at, machine learning has dramatically altered it. Let's take an example from the world of trading. Man Group's AHL Dimension programme is a $5.1 billion hedge fund which is partially managed by AI. By 2015, its machine learning algorithms were contributing more than half of the fund's profits, even though the assets under their management were far smaller. Machine learning has become a hot topic today, with professionals all over the world signing up for ML or AI courses for fear of being left behind. But what exactly is machine learning? It will be clear to you by the end of this article. Machine learning, as the name suggests, gives machines the ability to learn autonomously from experience, observation and the analysis of patterns within a given data set, without being explicitly programmed. When we write a program for some specific purpose, we are writing a definite set of instructions which the machine will follow. In machine learning, by contrast, we supply a data set from which the machine learns by identifying and analysing patterns, and it then makes decisions autonomously based on what it has learned from that data.
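
The contrast between a fixed set of instructions and a rule learned from data can be sketched in a few lines of Python. Everything here is a hypothetical illustration, not something from the article: the spam-by-link-count task, the tiny data set, and the function names are all assumptions.

```python
# Explicit programming: the programmer decides the rule in advance.
def is_spam_rule(num_links):
    return num_links > 5  # threshold chosen by hand


# Machine learning: the rule's parameter is chosen from labelled examples.
def fit_threshold(samples):
    """Return the cut-off that misclassifies the fewest training samples."""
    candidates = sorted({x for x, _ in samples})

    def errors(threshold):
        # Count samples where the prediction (x > threshold) disagrees with the label.
        return sum((x > threshold) != label for x, label in samples)

    return min(candidates, key=errors)


# Assumed toy data: (number of links in a message, is it spam?).
data = [(0, False), (1, False), (2, False), (3, True), (4, True), (7, True)]
threshold = fit_threshold(data)


def is_spam_learned(num_links):
    return num_links > threshold


print(threshold, is_spam_rule(3), is_spam_learned(3))
```

The hand-written rule never changes unless someone edits the code, whereas the learned threshold would shift automatically if the examples in `data` were different.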