Reinforcement Learning


Under the Hood with Reinforcement Learning – Understanding Basic RL Models

@machinelearnbot

Summary: Reinforcement Learning (RL) is likely to be the next big push in artificial intelligence. It's the core technique for robotics, smart IoT, game play, and many other emerging areas. But modeling in RL is very different from familiar statistical and deep learning techniques. In this two-part series we'll take a look at the basics of RL models and how they're built and used. In the next part, we'll address some of the complexities that make development a challenge.


Deep Reinforcement Learning Models: Tips & Tricks for Writing Reward Functions

#artificialintelligence

In this post, I'm going to cover tricks and best practices for writing the most effective reward functions for reinforcement learning models. If you're unfamiliar with deep reinforcement learning, you can learn more about it here before jumping into the post below. Crafting reward functions for reinforcement learning models is not easy, for the same reason that crafting incentive plans for employees is not easy: the agent optimizes exactly what you reward, not what you intended to reward. We run into a phenomenon affectionately known as the cobra effect.
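The cobra effect is easy to reproduce in miniature. The sketch below uses a toy 1-D corridor of my own invention (the environment, the two policies, and both reward functions are illustrative assumptions, not examples from the post) to show how a reward for "movement" can be farmed by oscillating in place, while a reward for progress toward the goal cannot:

```python
# Toy 1-D corridor illustrating the cobra effect in reward design.
# Environment, policies, and reward functions are illustrative assumptions.

def naive_reward(prev_pos, pos):
    # Pays for any movement -- exploitable by oscillating in place.
    return abs(pos - prev_pos)

def goal_reward(prev_pos, pos, goal=10):
    # Pays for progress toward the goal -- oscillation nets zero.
    return abs(goal - prev_pos) - abs(goal - pos)

def total(policy, reward_fn, steps=20):
    pos, ret = 0, 0.0
    for t in range(steps):
        nxt = policy(pos, t)
        ret += reward_fn(pos, nxt)
        pos = nxt
    return ret

oscillate = lambda pos, t: pos + (1 if t % 2 == 0 else -1)  # wiggle in place
forward = lambda pos, t: min(pos + 1, 10)                   # walk to the goal

print(total(oscillate, naive_reward))  # oscillating farms the movement reward
print(total(forward, goal_reward))     # only real progress pays off here
```

Under the naive reward the oscillating policy earns more than the one that actually reaches the goal; under the progress-based reward the loophole closes.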


Playing the Beer Game Using Reinforcement Learning

@machinelearnbot

The beer game is a widely used in-class game, played in supply chain management classes to demonstrate a phenomenon known as the bullwhip effect. The game consists of a serial supply chain network with four players: a retailer, a wholesaler, a distributor, and a manufacturer. In each period of the game, the retailer experiences a random demand from customers. Then the four players each decide how much inventory of "beer" to order: the retailer orders from the wholesaler, the wholesaler from the distributor, the distributor from the manufacturer, and the manufacturer from an external supplier that is not a player in the game.
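The bullwhip effect the game demonstrates can be sketched in a few lines. In this hypothetical simulation (the trend-chasing ordering rule and all parameters are illustrative assumptions, not the classroom game's actual rules), each tier's orders become the next tier's demand, and the variance of orders grows as you move upstream:

```python
import random

# Sketch of order-variance amplification along a serial supply chain,
# in the spirit of the beer game. Ordering rule and parameters are
# illustrative assumptions.

random.seed(0)
demand = [random.randint(2, 6) for _ in range(30)]  # customer demand at retail

def tier_orders(incoming, k=0.5):
    # Each tier orders what was demanded of it, plus k times the recent
    # change in demand -- a crude trend-chasing forecast.
    orders, prev = [], incoming[0]
    for d in incoming:
        orders.append(max(0, d + k * (d - prev)))
        prev = d
    return orders

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

signal, variances = demand, []
for role in ["retailer", "wholesaler", "distributor", "manufacturer"]:
    signal = tier_orders(signal)        # orders placed by this tier...
    variances.append(variance(signal))  # ...become the next tier's demand
    print(f"{role:12s} order variance: {variances[-1]:.1f}")
```

Even this crude rule amplifies demand fluctuations at every step, so the manufacturer sees far wilder swings than the retailer ever did.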


Lecture 14 Deep Reinforcement Learning

#artificialintelligence

In Lecture 14 we move from supervised learning to reinforcement learning (RL), in which an agent must learn to interact with an environment in order to maximize its reward. We discuss different algorithms for reinforcement learning including Q-Learning, policy gradients, and Actor-Critic. We show how deep reinforcement learning has been used to play Atari games and to achieve super-human Go performance in AlphaGo. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka "deep learning") approaches have greatly advanced the performance of these state-of-the-art visual recognition systems.
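Of the algorithms the lecture covers, Q-learning is the simplest to sketch. Below is a minimal tabular version on a tiny chain MDP; the environment, seed, and hyperparameters are my own illustrative assumptions, not the lecture's examples:

```python
import random

# Minimal tabular Q-learning on a 5-state chain: start at state 0,
# reward 1 for reaching state 4. All settings are illustrative assumptions.

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]               # step left or step right
alpha, gamma, eps = 0.5, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
random.seed(0)

def greedy(s):
    # Break ties randomly so the untrained agent still wanders.
    return max(ACTIONS, key=lambda a: (Q[(s, a)], random.random()))

for episode in range(200):
    s = 0
    while s != GOAL:
        a = random.choice(ACTIONS) if random.random() < eps else greedy(s)
        s2 = min(max(s + a, 0), N_STATES - 1)   # walls at both ends
        r = 1.0 if s2 == GOAL else 0.0
        best_next = 0.0 if s2 == GOAL else max(Q[(s2, b)] for b in ACTIONS)
        # Q-learning update: bootstrap off the best next action (off-policy)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = {s: greedy(s) for s in range(GOAL)}
print(policy)                    # learned greedy policy per state
```

After training, the greedy policy steps right from every state. Policy gradients and Actor-Critic, by contrast, parameterize the policy directly and follow the gradient of expected reward rather than maintaining a value table like this one.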


REINFORCEjs: Gridworld with Dynamic Programming

#artificialintelligence

This is a toy environment called **Gridworld** that is often used as a toy model in the Reinforcement Learning literature. In this particular case: **State space**: GridWorld has 10x10 = 100 distinct states. The start state is the top-left cell. The gray cells are walls and cannot be moved to.
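The dynamic programming the demo illustrates can be sketched as value iteration on a small grid. The grid size, wall positions, rewards, and discount below are illustrative assumptions, not the demo's exact settings:

```python
# Value iteration on a small deterministic gridworld (a sketch; layout,
# rewards, and discount are illustrative assumptions).

SIZE = 4
WALLS = {(1, 1), (2, 1)}        # gray cells that cannot be entered
GOAL = (3, 3)                   # terminal cell; entering it pays reward 1
GAMMA = 0.9
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

V = {(r, c): 0.0 for r in range(SIZE) for c in range(SIZE)
     if (r, c) not in WALLS}

for _ in range(100):            # sweep until (approximately) converged
    for s in V:
        if s == GOAL:
            continue
        best = float("-inf")
        for dr, dc in MOVES:
            s2 = (s[0] + dr, s[1] + dc)
            if s2 not in V:     # walls and off-grid moves bounce back
                s2 = s
            r = 1.0 if s2 == GOAL else 0.0
            best = max(best, r + GAMMA * (0.0 if s2 == GOAL else V[s2]))
        V[s] = best             # Bellman optimality backup

print(round(V[(0, 0)], 3))      # value of the start state
```

Because the shortest wall-avoiding path from the start to the goal is six steps, the start state's value converges to the reward discounted five times, i.e. 0.9^5.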


Back to the core of intelligence … to really move to the future

#artificialintelligence

Two decades ago I started working on metrics of machine intelligence. At that time, during the glacial days of the second AI winter, few were really interested in measuring something that AI lacked completely. And very few, such as David L. Dowe and I, were interested in metrics of intelligence linked to algorithmic information theory, where the models of interaction between an agent and the world were sequences of bits, and intelligence was formulated using Solomonoff's and Wallace's theories of inductive inference. In the meantime, seemingly dozens of variants of the Turing test were proposed every year, CAPTCHAs were introduced, and David showed how easy it is to solve some IQ tests using a very simple program based on a big-switch approach. And today a new AI spring has arrived, triggered by a blossoming machine learning field, bringing a more experimental approach to AI with an increasing number of AI benchmarks and competitions (see a previous entry in this blog for a survey).


Berkeley startup to train robots like puppets

@machinelearnbot

Robots today must be programmed by writing computer code, but imagine donning a VR headset and virtually guiding a robot through a task, like you would move the arms of a puppet, and then letting the robot take it from there. That's the vision of Pieter Abbeel, a professor of electrical engineering and computer science at the University of California, Berkeley, and his students, Peter Chen, Rocky Duan and Tianhao Zhang, who have launched a startup, Embodied Intelligence Inc., to use the latest techniques of deep reinforcement learning and artificial intelligence to make industrial robots easily teachable. "Right now, if you want to set up a robot, you program that robot to do what you want it to do, which takes a lot of time and a lot of expertise," said Abbeel, who is currently on leave to turn his vision into reality. "With our advances in machine learning, we can write a piece of software once -- machine learning code that enables the robot to learn -- and then when the robot needs to be equipped with a new skill, we simply provide new data." The "data" is training, much like you'd train a human worker, though with the added dimension of virtual reality.


1107_release

#artificialintelligence

Building on the founders' pioneering research in deep imitation learning, deep reinforcement learning, and meta-learning, Embodied Intelligence is developing AI software (aka robot brains) that can be loaded onto any existing robot. While traditional programming of robots requires writing code, a time-consuming endeavor even for robotics experts, Embodied Intelligence software will empower anyone to program a robot by simply donning a VR headset and guiding the robot through a task. These human demonstrations train deep neural nets, which are further tuned through reinforcement learning, resulting in robots that can be easily taught a wide range of skills in areas where existing solutions break down. Candidates to benefit from Embodied Intelligence's work include complicated tasks such as manipulating deformable objects (wires, fabrics, linens, apparel, fluid bags, and food); picking parts and order items out of cluttered, unstructured bins; and completing assemblies where hard automation struggles due to variability in parts, configurations, and individualization of orders.


Deep reinforcement learning: where to start – freeCodeCamp

#artificialintelligence

More than 200 million people watched as reinforcement learning (RL) took to the world stage. A few years earlier, DeepMind had made waves with a bot that could play Atari games. The company was soon acquired by Google. Many researchers believe that RL is our best shot at creating artificial general intelligence. It is an exciting field, with many unsolved challenges and huge potential.


Learning in Brains and Machines (3): Synergistic and Modular Action

@machinelearnbot

There is a dance--precisely choreographed and executed--that we perform throughout our lives. This is the dance formed by our movements. Our movements are our actions and the final outcome of our decision making processes. Single actions are built into reusable sequences, sequences are composed into complex routines, routines are arranged into elegant choreographies, and so the complexity of human action is realised. This synergy, the composition of actions into increasingly complex units, suggests the desirability of a modular and hierarchical approach to the selection and execution of actions.