AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction (Section 1.1). MIT Press, Cambridge, MA, 1998.

Neural Turing Machine • /r/MachineLearning

#artificialintelligence Jun-18-2016, 20:15:24 GMT

Hi folks, I have a few questions about the Neural Turing Machine (NTM). Is there any extension to these models? There are extensions, most notably the Reinforcement Learning NTM, which uses the REINFORCE rule to apply hard attention to the memory, and models that pair the NTM with different access modules. There is an implementation in Torch.

artificial intelligence, machine learning, machinelearning, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.37)

Why Graphics Cards are Hacking the Future

#artificialintelligence Jun-18-2016, 16:50:14 GMT

In 2013, years before DeepMind would unveil an AI that would defeat one of the world's best Go players, the company published an influential paper showing how "deep reinforcement learning" could be used to teach computers to play Atari 2600 video games. While GPUs were originally designed as specialized processors optimized to render the millions of pixels required for simulating 3D environments, repurposing GPUs to train artificial intelligence algorithms has been commonplace for a while. But it wasn't until I read Andrej Karpathy's recent post on reinforcement learning that something clicked about how interesting this is: graphics cards, originally designed for human vision of video games, are now being used for computer "vision" of video games. When I was growing up, getting a graphics card was kind of a Big Deal. The first one I got for Christmas in 1997 was a Pure3D Canopus Voodoo card based on the 3dfx chipset. It let me run Quake smoothly on my Pentium Compaq, which was a top priority of my life at the time.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

This robot chooses which human victims it wants to inflict pain on

#artificialintelligence Jun-14-2016, 10:15:47 GMT

The threat of killer robots may sound a little far-fetched, but this latest 'harmful robot' suggests we may have taken a step closer to this dystopian reality. Roboticist Alexander Reben from the University of California, Berkeley, has created a bot called "The First Law" that is capable of pricking a finger, but is programmed to choose not to every time if it means avoiding being switched off. Ultimately, it can decide whether or not to inflict pain to serve its own interest. The robot is named after the first law in a set of rules devised by sci-fi author Isaac Asimov, which – quoted as being from the Handbook of Robotics, 2058 AD – states "a robot may not injure a human being or, through inaction, allow a human being to come to harm". Reben's research paper explains how the robot operates in relation to "reinforcement learning agents" and how they are unlikely to behave optimally all the time.

inflict pain, machine learning, reinforcement learning, (9 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.25)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)

Genre: Instructional Material (0.56)

Industry: Government > Military (0.32)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Model-Free Episodic Control

Blundell, Charles, Uria, Benigno, Pritzel, Alexander, Li, Yazhe, Ruderman, Avraham, Leibo, Joel Z, Rae, Jack, Wierstra, Daan, Hassabis, Demis

arXiv.org Machine Learning Jun-14-2016

State of the art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of hippocampal episodic control can learn to solve difficult sequential decision-making tasks. We demonstrate that it not only attains a highly rewarding strategy significantly faster than state-of-the-art deep reinforcement learning algorithms, but also achieves a higher overall reward on some of the more challenging domains.

episodic controller, machine learning, reinforcement learning, (16 more...)

1606.04460

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games > Computer Games (0.69)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

This Week's Awesome Stories From Around the Web (Through June 11th)

#artificialintelligence Jun-11-2016, 16:51:57 GMT

ROBOTICS: Vyo Is a Fascinating and Unique Take on Social Domestic Robots
Evan Ackerman, IEEE Spectrum
"Vyo is 'a personal assistant serving as a centralized interface for smart home devices.' Nothing new there, but what sets Vyo apart is how you interact with it: it combines non-anthropomorphic design with anthropomorphic expressiveness and a tactile object-based control system into a social robot that's totally, adorably different."

ARTIFICIAL INTELLIGENCE: The AI Machines Undergoing Behavioral Psychology Tests
Technology Review
"The team says the best performing AI system uses deep reinforcement learning enhanced with additional memory. These machines retrieve relevant memories based on the context in which they were stored and in which the device finds itself. That's different from many existing memory systems that do not rely on context for memory retrieval."

INTERNET: A Computer Tried (and Failed) to Write This Article
Adrienne Lafrance, The Atlantic
"Here I am, a human, writing a story that was assigned to a machine.

awesome story, machine learning, reinforcement learning, (5 more...)

Industry: Information Technology (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.58)

[Q] Temporal Difference Learning in POMDP's • /r/MachineLearning

@machinelearnbot Jun-10-2016, 15:08:35 GMT

The environment is partially observable and will never be fully observable, due to a lack of information. Does anyone know of any models suitable for learning such a value function?

artificial intelligence, machine learning, reinforcement learning, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.57)

Google developing 'kill switch' to stop robot uprising against humans

#artificialintelligence Jun-9-2016, 10:06:01 GMT

They referenced a robot that learned how to pause a game of Tetris to avoid losing, adding that AIs are "unlikely to behave optimally all the time". "We have proposed a framework to allow a human operator to repeatedly safely interrupt a reinforcement learning agent while making sure the agent will not learn to prevent or induce these interruptions," the paper concluded. "Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences, or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for this."

artificial intelligence, machine learning, reinforcement learning, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

What are the business applications of deep reinforcement learning? • /r/MachineLearning

@machinelearnbot Jun-9-2016, 05:01:10 GMT

What are the business applications of deep reinforcement learning? Besides being great for playing games and useful to understand how intelligence works, how is deep reinforcement learning being used in businesses?

artificial intelligence, deep reinforcement learning, machine learning, (2 more...)

Industry: Information Technology > Software (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep Successor Reinforcement Learning

Kulkarni, Tejas D., Saeedi, Ardavan, Gautam, Simanta, Gershman, Samuel J.

arXiv.org Machine Learning Jun-8-2016

Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components -- a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state and the reward predictor maps states to scalar rewards. The value function of a state can be computed as the inner product between the successor map and the reward weights. In this paper, we present DSR, which generalizes SR within an end-to-end deep reinforcement learning framework. DSR has several appealing properties including: increased sensitivity to distal reward changes due to factorization of reward and world dynamics, and the ability to extract bottleneck states (subgoals) given successor maps trained under a random policy. We show the efficacy of our approach on two diverse environments given raw pixel observations -- simple grid-world domains (MazeBase) and the Doom game engine.
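The decomposition described in this abstract can be illustrated with a small tabular sketch. This is not the DSR architecture from the paper, which learns both components from raw pixels with deep networks; it only assumes a known transition matrix under a fixed policy and a hand-written reward vector. The names `successor_map` and `values`, the three-state chain, and the discount of 0.9 are all hypothetical choices made for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def successor_map(P, gamma, sweeps=500):
    """Tabular successor map M = sum_t gamma^t P^t, found by iterating M <- I + gamma * P * M.

    P[i][j] is the probability of moving from state i to state j under a
    fixed policy (an assumption of this sketch).
    """
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(sweeps):
        PM = matmul(P, M)
        M = [[identity[i][j] + gamma * PM[i][j] for j in range(n)] for i in range(n)]
    return M


def values(M, w):
    """State values as the inner product of each successor-map row with the reward weights."""
    return [sum(m * r for m, r in zip(row, w)) for row in M]


# Hypothetical 3-state chain: 0 -> 1 -> 2, where state 2 loops back to itself.
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 1.0]]
gamma = 0.9

M = successor_map(P, gamma)

# Reward only in state 2.
V = values(M, [0.0, 0.0, 1.0])

# Moving the reward to state 0 reuses the same successor map unchanged.
V_moved = values(M, [1.0, 0.0, 0.0])
```

Because the world dynamics live entirely in `M`, a change in the reward only requires a new inner product, which is the factorization the abstract credits for sensitivity to distal reward changes.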

arxiv preprint arxiv, machine learning, reinforcement learning, (13 more...)

1606.02396

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Games (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep Reinforcement Learning with a Natural Language Action Space

He, Ji, Chen, Jianshu, He, Xiaodong, Gao, Jianfeng, Li, Lihong, Deng, Li, Ostendorf, Mari

arXiv.org Artificial Intelligence Jun-8-2016

This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games. Termed a deep reinforcement relevance network (DRRN), the architecture represents action and state spaces with separate embedding vectors, which are combined with an interaction function to approximate the Q-function in reinforcement learning. We evaluate the DRRN on two popular text games, showing superior performance over other deep Q-learning architectures. Experiments with paraphrased action descriptions show that the model is extracting meaning rather than simply memorizing strings of text.
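The idea of separate state and action embeddings combined by an interaction function can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it uses bag-of-words inputs, two small tanh networks with random (untrained) weights, and an inner product as the interaction function. The vocabulary, layer sizes, example texts, and all function names are hypothetical.

```python
import math
import random


def bag_of_words(text, vocab):
    """Count-vector encoding of a text over a fixed vocabulary (unknown words are ignored)."""
    vec = [0.0] * len(vocab)
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def random_layer(rng, n_out, n_in):
    """Weight matrix with Gaussian entries; a stand-in for trained parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def embed(x, layers):
    """Pass a vector through a stack of tanh layers and return the final embedding."""
    for W in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]
    return x


def q_values(state_text, action_texts, vocab, state_net, action_net):
    """Score each candidate action as the inner product of its embedding with the state embedding."""
    s = embed(bag_of_words(state_text, vocab), state_net)
    scores = []
    for action in action_texts:
        a = embed(bag_of_words(action, vocab), action_net)
        scores.append(sum(si * ai for si, ai in zip(s, a)))
    return scores


rng = random.Random(0)
words = "you stand at a fork in the road go left right wait".split()
vocab = {w: i for i, w in enumerate(dict.fromkeys(words))}

# Separate networks for states and actions (hidden size 8, embedding size 4).
state_net = [random_layer(rng, 8, len(vocab)), random_layer(rng, 4, 8)]
action_net = [random_layer(rng, 8, len(vocab)), random_layer(rng, 4, 8)]

actions = ["go left", "go right", "wait"]
q = q_values("you stand at a fork in the road", actions, vocab, state_net, action_net)
best = max(range(len(actions)), key=lambda i: q[i])
```

Scoring each action text separately means the set of candidate actions can change from step to step, which suits text games where the available choices are themselves sentences. With untrained weights the scores are arbitrary; in practice the two networks would be trained with a Q-learning objective.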

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1511.04636

Country: North America > United States > Washington > King County (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
