Reinforcement Learning
5 Ways to Get Started with Reinforcement Learning
Machine learning algorithms, and neural networks in particular, are considered to be the cause of a new AI 'revolution'. In this article I will introduce the concept of reinforcement learning (RL), but with limited technical detail, so that readers with a variety of backgrounds can understand the essence of the technique, its capabilities and its limitations. At the end of the article, I will provide links to a few resources for implementing RL. Broadly speaking, data-driven algorithms can be categorized into three types: supervised, unsupervised, and reinforcement learning. The first two are generally used to perform tasks such as image classification, detection, etc.
Machine Learning Is Making Video Game Characters Smarter And Robots More Competent
So says Danny Lange, the VP of AI and machine learning at Unity Technologies, a major maker of game "engine" software that handles the underlying mechanics of titles like Firewatch and ChronoBlade. Today the company announced Unity Machine Learning Agents--open-source software linking its game engine to machine learning programs such as Google's TensorFlow. It will allow non-playable characters, through trial and error, to develop better, more creative strategies than a human could program, says Lange, using a branch of machine learning called deep reinforcement learning. Unity's new AI-linking tool isn't confined to virtual characters. The software can also speed up the development of real-life robots, like self-driving cars, says Lange, by training them relentlessly in sprawling, computer-generated--but lifelike--virtual landscapes.
Flipboard on Flipboard
For years, video game developers have used artificial intelligence to animate those characters encountered by a player, but non-playable characters, or NPCs, have been based on sets of rules coded by humans. Using the AI technology du jour, machine learning, future NPCs will program and reprogram their own rules, based on the experiences they encounter in games, in the process getting smarter the longer they play. So says Danny Lange, the VP of AI and machine learning at Unity Technologies, a major maker of game "engine" software that handles the underlying mechanics of titles like Firewatch and ChronoBlade. Today the company announced Unity Machine Learning Agents--open-source software linking its game engine to machine learning programs such as Google's TensorFlow. It will allow non-playable characters, through trial and error, to develop better, more creative strategies than a human could program, says Lange, using a branch of machine learning called deep reinforcement learning.
AI Startup Invents Trick For Robots To More Efficiently Teach Themselves Complex Tasks
Google-owned DeepMind uses sophisticated computer simulations for computers to teach themselves how to accomplish certain tasks. The simulated training, known as reinforcement learning, involves the computer trying out thousands (or millions) of different things until it manages to figure out what to do. Using this approach combined with deep learning, the London-based artificial intelligence research unit is teaching computers how to beat the world's best Go players and training robots how to move around in the world. A tiny Berkeley, California-based AI startup, Bonsai, has invented a trick to beat DeepMind in this game. The trick -- the company is calling it "concept networks" -- massively increases the efficiency of reinforcement learning.
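As a rough illustration of the trial-and-error idea described above, here is a minimal tabular Q-learning sketch in Python. It is not DeepMind's or Bonsai's method: the corridor environment, the function names `train_corridor` and `greedy_policy`, and all hyperparameters are assumptions chosen only to keep the example small.

```python
import random


def train_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in state 0 and receives a reward of 1.0 only when it
    reaches the last state. Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: try a random action
            else:
                best = max(q[s])      # exploit, breaking ties at random
                a = rng.choice([i for i in (0, 1) if q[s][i] == best])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # Move the estimate toward reward plus discounted future value.
            # The goal row is never updated, so its value stays at 0.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q


def greedy_policy(q):
    """Best known action for every non-terminal state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor()))
```

Early episodes are close to a random walk; once the reward has been found a few times, its value propagates backward through the table and the greedy policy settles on stepping right in every state.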
Facebook heads to Canada in search of the next big AI advance
Several leading figures in AI, including LeCun, have studied or taught at Canadian universities. Reinforcement learning builds on deep learning to let machines learn through experimentation. Michael Bowling, a U.S.-born computer scientist who leads a lab at the University of Alberta that has produced cutting-edge poker-playing machines, says the new Facebook lab simply shows that Canada already leads the rest of the world in AI. Indeed, after seeing AI researchers snapped up by big U.S. companies in recent years, Canada may well hope that the environment fostered by new labs, including the one in Montreal, will eventually produce companies that rival the likes of Facebook.
Facebook's New Lab Bolsters Montreal's Bragging Rights As An AI Hub
On Friday, the social networking giant is announcing the opening of a new AI research lab, its fourth, in the Canadian city. Led by Joelle Pineau, an expert in the areas of dialogue systems and reinforcement learning, and a professor at McGill University in Montreal, the lab is expected to grow from an initial team of 10–including interns–to about 30 within a couple of years. Among those behind Montreal's emergence as a leader in AI research is University of Montreal professor and director of the school's Montreal Institute for Learning Algorithms Yoshua Bengio, a pioneer in deep learning. "Facebook is clearly a leader in AI," Bengio said in a statement, "and the creation [of] Facebook's AI lab here is going to contribute to the expansion of Montreal as an international hub for AI, an ecosystem joining universities [and] established companies as well as startups." Pineau, who plans on maintaining her affiliation with McGill–and who will split her time evenly between the university and Facebook–said her team's mandate is to develop the next generation of artificial intelligence technology, particularly in the areas of computer vision, natural language, and video analysis.
Guided Deep Reinforcement Learning for Swarm Systems
Hüttenrauch, Maximilian, Šošić, Adrian, Neumann, Gerhard
In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each other and 2) locating a target.
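The split between what the actors and the critic are allowed to see can be sketched as follows. This is only an illustration of the information flow described in the abstract, not the authors' implementation: the function names, the choice of sorted neighbour distances as the local observation, and the `sensing_radius` parameter are all assumptions.

```python
import math


def local_observation(positions, i, sensing_radius):
    """Hypothetical actor input: distances from agent i to the neighbours
    it can sense, sorted so the result does not depend on agent ordering."""
    xi, yi = positions[i]
    dists = [math.hypot(xj - xi, yj - yi)
             for j, (xj, yj) in enumerate(positions) if j != i]
    return sorted(d for d in dists if d <= sensing_radius)


def global_state(positions):
    """Hypothetical critic input: every agent's coordinates, flattened.
    In the paper's example this would come from a camera image of the scene."""
    return [coord for pos in positions for coord in pos]


if __name__ == "__main__":
    swarm = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
    for i in range(len(swarm)):
        print("actor", i, "sees", local_observation(swarm, i, sensing_radius=6.0))
    print("critic sees", global_state(swarm))
```

During training the critic would score actions using `global_state`, while each policy is fed only its own `local_observation`, so the learned policies can still run at deployment time when no global view is available.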
Why Pay More When You Can Pay Less: A Joint Learning Framework for Active Feature Acquisition and Classification
Shim, Hajin, Hwang, Sung Ju, Yang, Eunho
We consider the problem of active feature acquisition, where we sequentially select the subset of features in order to achieve the maximum prediction performance in the most cost-effective way. In this work, we formulate this active feature acquisition problem as a reinforcement learning problem, and provide a novel framework for jointly learning both the RL agent and the classifier (environment). We also introduce a more systematic way of encoding subsets of features that properly handles the innate challenge of missing entries in active feature acquisition problems, using an orderless LSTM-based set encoding mechanism that readily fits into the joint learning framework. We evaluate our model on a carefully designed synthetic dataset for active feature acquisition as well as several real datasets such as electronic health record (EHR) datasets, on which it outperforms all baselines in terms of prediction performance as well as feature acquisition cost.
Embodied Artificial Intelligence through Distributed Adaptive Control: An Integrated Framework
Moulin-Frier, Clément, Puigbò, Jordi-Ysard, Arsiwalla, Xerxes D., Sanchez-Fibla, Martì, Verschure, Paul F. M. J.
In recent years, research in Artificial Intelligence has been primarily dominated by impressive advances in Machine Learning, with a strong emphasis on the so-called Deep Learning framework. It has allowed considerable achievements such as human-level performance in visual classification [1] and description [2], in Atari video games [3] and even in the highly complex game of Go [4]. The Deep Learning approach is characterized by assuming only minimal priors on the task to be solved, compensating for this lack of prior knowledge by feeding the learning algorithm with an extremely large amount of training data, while hiding the intermediary representations. However, it is worth noting that the most important contributions of Deep Learning to Artificial Intelligence often owe their success in part to their integration with other types of learning algorithms. For example, the AlphaGo program, which defeated the world champions in the famously complex game of Go [4], is based on the integration of Deep Reinforcement Learning with a Monte-Carlo tree search algorithm. Without the tree search addition, AlphaGo still outperforms previous machine performances but is unable to beat high-level human players. Another example can be found in the original Deep Q-Learning algorithm (DQN, Mnih et al., 2015), achieving very poor performance in some Atari games where the reward is considerably sparse and delayed (e.g.
Using machine learning for signal processing/classification • r/MachineLearning
This semester I'll be writing my specialization project thesis, which will be about mind-controlled drones. Most of the project will be processing and analyzing the signals from the sensor headset, and classifying these signals so that I can use them to control the drone. I am considering trying something a bit different from what people have been doing before on this project: I want to use reinforcement learning to teach the drone to fly the way I want. I am by no means a machine learning expert, and would like to hear what you guys think about this approach. Is there another approach you think would work better?