Reinforcement Learning
Reinforcement Learning and AI
Summary: At the core of modern AI, particularly robotics and sequential tasks, is Reinforcement Learning. Although RL has been around for many years, it has become the third leg of the machine learning stool, and it is increasingly important for data scientists to know when and how to implement it. If you had polled a group of data scientists just a few years ago about how many machine learning problem types there are, you would almost certainly have gotten a binary response: problem types were clearly divided into supervised and unsupervised. While Reinforcement Learning (RL) has been around since at least the 1980s, and before that in the behavioral sciences, its introduction as a major player in machine learning reflects its rising importance in AI. What problems fit this description?
Finding Career Opportunities in AI
Summary: Are there large, sustainable career opportunities in AI, and if so, where? Do they lie in the current technologies of Deep Learning and Reinforcement Learning, or should you focus your career on the next wave of AI? If you're a data scientist thinking about expanding your career options into AI, you've got a forest-and-trees problem. There's a lot going on in deep learning and reinforcement learning, but do these areas hold the best future job prospects, or do we need to look a little further ahead? To try to answer that question, we'll have to get out of the weeds of current development and get a higher-level perspective on where this is all headed. The roots of AI are actually in the behavioral sciences, migrating eventually into biology and neurology.
This is what the world's top StarCraft players think of a potential contest with advanced AI
Expectations for a match-up between a professional StarCraft player and sophisticated AI ratcheted up last year after an AI program beat a highly ranked human player at Go, one of the world's most difficult board games. Dave Churchill, an assistant professor of computer science at Memorial University of Newfoundland, who has run the AIIDE competition for the past six years, says the contest's AI bots generally play at a "low amateur" level and have never won against a proficient human player. Last November, DeepMind announced it would collaborate with StarCraft publisher Blizzard to create a free, open-source API tool to enable researchers to test AI algorithms in StarCraft II. Around the same time, Facebook's AI Research group described a reinforcement-learning algorithm it made for StarCraft and released its own free, open-source tools to help AI researchers link deep-learning algorithms to an early version of the game.
Ensemble Machine Learning in Python: Random Forest, AdaBoost
In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning. Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on par with human experts. Google's AlphaGo program was able to beat a world champion in the strategy game Go using deep reinforcement learning. Machine learning is even being used to program self-driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.
Atari games and Intel processors
Adamski, Robert, Grel, Tomasz, Klimek, Maciej, Michalewski, Henryk
The asynchronous nature of state-of-the-art reinforcement learning algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes them exceptionally suitable for CPU computations. However, given that deep reinforcement learning often deals with interpreting visual information, a large part of the training and inference time is spent performing convolutions. In this work we present our results on learning strategies in Atari games using a Convolutional Neural Network, the Math Kernel Library and the TensorFlow 0.11rc0 machine learning framework. We also analyze the effects of asynchronous computations on the convergence of reinforcement learning algorithms.
5 EBooks to Read Before Getting into A Machine Learning Career
Note that, while there are numerous machine learning ebooks available for free online, including many which are very well-known, I have opted to move past these "regulars" and seek out lesser-known and more niche options for readers. The book has wide coverage of probabilistic machine learning, including discrete graphical models, Markov decision processes, latent variable models, Gaussian processes, and stochastic and deterministic inference, among others. The material is excellent for an advanced undergraduate or introductory graduate course in graphical models or probabilistic machine learning. One of these target audiences is university students (undergraduate or graduate) learning about machine learning, including those who are beginning a career in deep learning and artificial intelligence research.
Delving into adversarial attacks on deep policies
Adversarial examples have been shown to exist for a variety of deep learning architectures. Deep reinforcement learning has shown promising results on training agent policies directly on raw inputs such as image pixels. In this paper we present a novel study into adversarial attacks on deep reinforcement learning policies. We compare the effectiveness of the attacks using adversarial examples vs. random noise. We present a novel method for reducing the number of times adversarial examples need to be injected for a successful attack, based on the value function. We further explore how re-training on random noise and FGSM perturbations affects the resilience against adversarial examples.
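To make the comparison between FGSM (fast gradient sign method) perturbations and random noise concrete, here is a minimal sketch. It assumes a tiny linear-softmax policy rather than a deep network, and the names `policy_probs`, `fgsm_perturb`, `random_perturb`, and the weight layout are hypothetical illustrations, not the setup used in the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Action probabilities of an assumed linear-softmax policy.

    weights[a][i] is the (hypothetical) weight of state feature i for action a.
    """
    logits = [sum(w * x for w, x in zip(row, state)) for row in weights]
    return softmax(logits)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_perturb(weights, state, epsilon):
    """Shift every state feature by epsilon in the direction that increases
    the cross-entropy loss of the action the policy currently prefers."""
    probs = policy_probs(weights, state)
    action = max(range(len(probs)), key=probs.__getitem__)
    # Gradient of -log p[action] w.r.t. the state: W^T (p - onehot(action)).
    grad = [
        sum(weights[a][i] * (probs[a] - (1.0 if a == action else 0.0))
            for a in range(len(probs)))
        for i in range(len(state))
    ]
    return [x + epsilon * sign(g) for x, g in zip(state, grad)]


def random_perturb(state, epsilon, rng):
    """Baseline: random-sign noise with the same per-feature magnitude."""
    return [x + epsilon * rng.choice((-1.0, 1.0)) for x in state]


if __name__ == "__main__":
    weights = [[1.0, -0.5, 0.2], [-0.3, 0.8, 0.1]]  # illustrative values only
    state = [0.5, 0.1, -0.2]
    rng = random.Random(0)
    print("clean :", policy_probs(weights, state))
    print("fgsm  :", policy_probs(weights, fgsm_perturb(weights, state, 0.1)))
    print("random:", policy_probs(weights, random_perturb(state, 0.1, rng)))
```

Both perturbations have the same per-feature magnitude, so any difference in how far the preferred action's probability drops comes from the direction of the perturbation alone. For a deep policy the gradient would come from backpropagation rather than the closed form used here.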
Identification and Off-Policy Learning of Multiple Objectives Using Adaptive Clustering
Karimpanal, Thommen George, Wilhelm, Erik
In this work, we present a methodology that enables an agent to make efficient use of its exploratory actions by autonomously identifying possible objectives in its environment and learning them in parallel. The identification of objectives is achieved using an online and unsupervised adaptive clustering algorithm. The identified objectives are learned (at least partially) in parallel using Q-learning. Using a simulated agent and environment, it is shown that the converged or partially converged value function weights resulting from off-policy learning can be used to accumulate knowledge about multiple objectives without any additional exploration. We claim that the proposed approach could be useful in scenarios where the objectives are initially unknown or in real-world scenarios where exploration is typically a time- and energy-intensive process. The implications and possible extensions of this work are also briefly discussed.
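The idea of learning several objectives in parallel from one stream of exploratory experience can be sketched with tabular Q-learning. This sketch rests on stated assumptions: the objectives are given up front as goal states (the paper identifies them with adaptive clustering, which is not shown), the environment is a hypothetical five-state corridor, and all names and hyperparameters are illustrative.

```python
import random

N_STATES = 5        # hypothetical corridor with states 0..4
ACTIONS = (-1, +1)  # move left or move right


def step(state, action):
    """Deterministic corridor dynamics, clipped at both ends."""
    return min(max(state + action, 0), N_STATES - 1)


def learn_objectives(goals, episodes=200, steps=20,
                     alpha=0.5, gamma=0.9, seed=0):
    """Learn one Q-table per goal from a single random behaviour policy.

    Q-learning is off-policy, so every transition the agent experiences
    can update the value estimates of all objectives at once.
    """
    rng = random.Random(seed)
    q = {g: [[0.0, 0.0] for _ in range(N_STATES)] for g in goals}
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(steps):
            a = rng.randrange(len(ACTIONS))  # purely exploratory action
            nxt = step(state, ACTIONS[a])
            for g in goals:
                if nxt == g:
                    # Reaching goal g ends the task for that objective.
                    target = 1.0
                else:
                    target = gamma * max(q[g][nxt])
                q[g][state][a] += alpha * (target - q[g][state][a])
            state = nxt
    return q


def greedy_action(q_table, state):
    """Action with the highest estimated value in the given state."""
    values = q_table[state]
    return ACTIONS[values.index(max(values))]


if __name__ == "__main__":
    tables = learn_objectives(goals=(0, 4))
    for goal, table in tables.items():
        print("goal", goal, "greedy action from state 2:",
              greedy_action(table, 2))
```

Because the behaviour policy here is uniformly random, no objective gets exploration of its own; each Q-table is updated from the shared transitions, which is the sense in which knowledge about multiple objectives accumulates without additional exploration.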
Enhancing Multi-Objective Reinforcement Learning with Concept Drift
Webber, Frederick Charles (United States Air Force Research Laboratory) | Peterson, Gilbert (Air Force Institute of Technology)
Reinforcement learning (RL) is a particular machine learning technique enabling an agent to learn while interacting with its environment. Agents in non-stationary environments are faced with the additional problem of handling concept drift, which is a partially observable change that modifies the environment without notification. This causes several problems: agents with decaying exploration fail to adapt, while agents capable of adapting may overfit to noise and overwrite previously learned knowledge. These issues are known as the plasticity-stability dilemma and catastrophic forgetting, respectively. Agents in such environments must take steps to mitigate both problems. This work contributes an algorithm that combines a concept drift classifier with multi-objective reinforcement learning (MORL) to produce an unsupervised technique for learning in non-stationary environments, especially in the face of partially observable changes. The algorithm manages the plasticity-stability dilemma by strategically adjusting learning rates and mitigates catastrophic forgetting by systematically storing knowledge and recalling it when it recognizes repeat situations. Results demonstrate that agents using this algorithm outperform agents using an approach that ignores non-stationarity.