AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
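
As a rough illustration of "discover which actions yield the most reward by trying them", the sketch below runs an epsilon-greedy learner on a three-armed bandit. It is not taken from Sutton and Barto's text: the function name `run_bandit`, the reward means, the noise level, and the exploration rate are all assumptions chosen for the example.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, noise_std=0.5, seed=0):
    """Epsilon-greedy learner on a multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    counts = [0] * n_actions        # how often each action was tried
    estimates = [0.0] * n_actions   # running average reward per action
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action, since nobody tells us the best one.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest estimated reward so far.
            action = max(range(n_actions), key=lambda a: estimates[a])

        # The environment returns only a noisy numerical reward.
        reward = rng.gauss(true_means[action], noise_std)

        counts[action] += 1
        # Incremental update of the sample mean for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward

    return estimates, counts, total_reward


# Hypothetical reward means; the learner never sees these directly.
estimates, counts, total = run_bandit([0.2, 0.5, 0.9])
print("pull counts per action:", counts)
```

The learner is never told which arm is best; it can only infer that from the rewards it collects, which is the distinction the quotation draws against supervised learning.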

[P] A Collection of Minimal RL Algorithms (now with advanced examples) • r/MachineLearning

@machinelearnbot, Jun-29-2017, 03:00:15 GMT

We have been working on a book covering the tutorials of code examples in the link. Now we are in the stage of finalizing the code while editing. Please let me know if we missed anything! It will be published in Korea in a month and will be translated into English (hopefully soon).

machine learning, minimal rl algorithm, reinforcement learning, (4 more...)

@machinelearnbot

Industry: Media > News (0.40)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Temporal-related Convolutional-Restricted-Boltzmann-Machine capable of learning relational order via reinforcement learning procedure?

Wang, Zizhuang

arXiv.org Machine Learning, Jun-24-2017

In this article, we extend the conventional framework of the convolutional Restricted Boltzmann Machine to learn highly abstract features among an arbitrary number of time-related input maps by constructing a layer of multiplicative units, which capture the relations among inputs. In many cases, more than two maps are strongly related, so it is wise to let each multiplicative unit learn relations among more input maps, in other words, to find the optimal relational order of each unit. To enable our machine to learn relational order, we developed a reinforcement-learning method, whose optimality is proven, to train the network.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

1706.08001

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

Tesla's new AI guru will help its cars learn for themselves

#artificialintelligence, Jun-23-2017, 07:50:12 GMT

Elon Musk has hired a new director of AI research at Tesla, and it may signal a plan to rethink the way its automated driving works. This week, Musk poached Andrej Karpathy, an expert on vision, deep learning, and reinforcement learning, from OpenAI, a nonprofit that Musk and others are funding that's dedicated to "discovering and enacting the path to safe artificial general intelligence." Karpathy, who will apparently report directly to Musk, is a rising star in the world of AI, having studied at Stanford with Fei-Fei Li, a leading AI expert who is now the chief scientist of Google Cloud. Li is famous in tech circles for having developed a data set of images that helped inspire a breakthrough in machine vision. Many have pointed to Karpathy's expertise in computer vision as a key asset for Tesla, and that's true.

machine learning, reinforcement learning, tesla, (11 more...)

#artificialintelligence

Genre: Research Report > Promising Solution (0.40)

Industry:

Automobiles & Trucks (0.76)
Transportation > Ground > Road (0.56)
Information Technology > Robotics & Automation (0.56)
Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Market Interfaces for Electric Vehicle Charging

Stein, Sebastian, Gerding, Enrico H., Nedea, Adrian, Rosenfeld, Avi, Jennings, Nicholas R.

Journal of Artificial Intelligence Research, Jun-22-2017

We consider settings where owners of electric vehicles (EVs) participate in a market mechanism to charge their vehicles. Existing work on such mechanisms has typically assumed that participants are fully rational and can report their preferences accurately via some interface to the mechanism or to a software agent participating on their behalf. However, this may not be reasonable in settings with non-expert human end-users. Thus, our overarching aim in this paper is to determine experimentally if a fully expressive market interface that enables accurate preference reports is suitable for the EV charging domain, or, alternatively, if a simpler, restricted interface that reduces the space of possible options is preferable. In doing this, we measure the performance of an interface both in terms of how it helps participants maximise their utility and how it affects deliberation time. Our secondary objective is to contrast two different types of restricted interfaces that vary in how they restrict the space of preferences that can be reported. To enable this analysis, we develop a novel game that replicates key features of an abstract EV charging scenario. In two experiments with over 300 users, we show that restricting the users' preferences significantly reduces the time they spend deliberating (by up to half in some cases). An extensive usability survey confirms that this restriction is furthermore associated with a lower perceived cognitive burden on the users. More surprisingly, at the same time, using restricted interfaces leads to an increase in the users' performance compared to the fully expressive interface (by up to 70%). We also show that some restricted interfaces have the desirable effect of reducing the energy consumption of their users by up to 20% while achieving the same utility as other interfaces. Finally, we find that a reinforcement learning agent displays similar performance trends to human users, enabling a novel methodology for evaluating market interfaces.

experiment, interface, journey, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.5387

AI Access Foundation

11065

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
South America > Brazil (0.04)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Signaling Game Approach to Databases Querying and Interaction

McCamish, Ben, Termehchy, Arash, Touri, Behrouz

arXiv.org Artificial Intelligence, Jun-22-2017

As most database users cannot precisely express their information needs, it is challenging for database management systems to understand them. We propose a novel formal framework for representing and understanding information needs in database querying and exploration. Our framework considers querying as a collaboration between the user and the database management system to establish a mutual language for representing information needs. We formalize this collaboration as a signaling game, where each mutual language is an equilibrium for the game. A query interface is more effective if it establishes a less ambiguous mutual language faster. We discuss some equilibria, strategies, and the convergence in this game. In particular, we propose a reinforcement learning mechanism and analyze it within our framework. We prove that this adaptation mechanism for the query interface improves the effectiveness of answering queries stochastically speaking, and converges almost surely. We extend our results to cases where the user also modifies her strategy during the interaction.

machine learning, natural language, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

1603.04068

Country: North America > United States (0.93)

Genre: Research Report (0.81)

Industry: Leisure & Entertainment > Games (0.48)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Game Theory (1.00)
Information Technology > Databases (1.00)
(3 more...)

Statistical Mechanics of Node-perturbation Learning with Noisy Baseline

Hara, Kazuyuki, Katahira, Kentaro, Okada, Masato

arXiv.org Machine Learning, Jun-20-2017

Node-perturbation learning is a type of statistical gradient descent algorithm that can be applied to problems where the objective function is not explicitly formulated, including reinforcement learning. It estimates the gradient of an objective function by using the change in the objective function in response to the perturbation. The value of the objective function for an unperturbed output is called a baseline. Cho et al. proposed node-perturbation learning with a noisy baseline. In this paper, we report on building the statistical mechanics of Cho's model and on deriving coupled differential equations of order parameters that depict learning dynamics. We also show how to derive the generalization error by solving the differential equations of order parameters. On the basis of the results, we show that Cho's results also apply in general cases and show some general performances of Cho's model.
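
The gradient estimate described in this abstract can be sketched for a single linear layer. This is a minimal illustration under stated assumptions, not the model analyzed in the paper: the baseline here is noise-free (the paper studies a noisy one), and the teacher-student setup, squared-error objective, function name, and all hyperparameters are hypothetical choices.

```python
import numpy as np


def node_perturbation_train(steps=2000, n_in=10, n_out=3,
                            lr=0.01, sigma=0.01, seed=0):
    """Train a linear student with node perturbation (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    w_teacher = rng.normal(size=(n_out, n_in))  # hypothetical target weights
    w = np.zeros((n_out, n_in))                 # student weights

    def objective(output, target):
        # Treated as a black box: the learner only evaluates it.
        return 0.5 * np.sum((output - target) ** 2)

    initial_gap = float(np.sum((w - w_teacher) ** 2))

    for _ in range(steps):
        x = rng.normal(size=n_in)
        target = w_teacher @ x
        y = w @ x

        # Baseline: objective value for the unperturbed output.
        baseline = objective(y, target)

        # Perturb the output nodes with small Gaussian noise.
        xi = rng.normal(size=n_out)
        perturbed = objective(y + sigma * xi, target)

        # The change in the objective, scaled by the perturbation, gives a
        # stochastic estimate of the gradient with respect to the outputs.
        grad_y = (perturbed - baseline) / sigma * xi

        # Chain rule through the linear layer: dE/dW is approx. grad_y x^T.
        w -= lr * np.outer(grad_y, x)

    final_gap = float(np.sum((w - w_teacher) ** 2))
    return initial_gap, final_gap


initial_gap, final_gap = node_perturbation_train()
print("squared weight gap before and after:", initial_gap, final_gap)
```

No analytic derivative of the objective is used anywhere in the loop, which is why the same idea carries over to settings such as reinforcement learning where only a scalar evaluation is available.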

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

doi: 10.7566/JPSJ.86.024002

1706.06953

Country: Asia > Japan > Honshū > Kantō (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Classifying Options for Deep Reinforcement Learning

Arulkumaran, Kai, Dilokthanakul, Nat, Shanahan, Murray, Bharath, Anil Anthony

arXiv.org Artificial Intelligence, Jun-19-2017

In this paper we combine one method for hierarchical reinforcement learning - the options framework - with deep Q-networks (DQNs) through the use of different "option heads" on the policy network, and a supervisory network for choosing between the different options. We utilise our setup to investigate the effects of architectural constraints in subtasks with positive and negative transfer, across a range of network capacities. We empirically show that our augmented DQN has lower sample complexity when simultaneously learning subtasks with negative transfer, without degrading performance when learning subtasks with positive transfer.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

1604.08153

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reinforcement Learning in Rich-Observation MDPs using Spectral Methods

Azizzadenesheli, Kamyar, Lazaric, Alessandro, Anandkumar, Animashree

arXiv.org Artificial Intelligence, Jun-18-2017

Designing effective exploration-exploitation algorithms in Markov decision processes (MDPs) with large state-action spaces is the main challenge in reinforcement learning (RL). In fact, the learning performance degrades with the number of states and actions in the MDP. However, MDPs often exhibit a low-dimensional latent structure in practice, where a small hidden state is observable through a possibly large number of observations. In this paper, we study the setting of rich-observation Markov decision processes (ROMDPs), where hidden states are mapped to observations through an injective mapping, so that an observation can be generated by only one hidden state. While this mapping is unknown a priori, we introduce a spectral decomposition method that consistently estimates how observations are clustered in the hidden states. The estimated clustering is then integrated into an optimistic algorithm for RL (UCRL), which operates on the smaller clustered space. The resulting algorithm proceeds through phases and we show that its per-step regret (i.e., the difference in cumulative reward between the algorithm and the optimal policy) decreases as more observations are clustered together and finally matches the (ideal) performance of an RL algorithm running directly on the hidden MDP.
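
To make the regret notion in this abstract concrete, the sketch below computes cumulative regret and its per-step average from a reward trace. It assumes the optimal policy's average reward per step is known and constant, which is a simplification for illustration; the function name and the sample numbers are hypothetical and do not come from the paper.

```python
def regret_trace(rewards, optimal_average_reward):
    """Return cumulative regret and the per-step regret after each step."""
    cumulative = 0.0
    per_step = []
    for t, reward in enumerate(rewards, start=1):
        # Regret grows by the shortfall against the optimal policy's reward.
        cumulative += optimal_average_reward - reward
        # Per-step regret is the cumulative regret averaged over elapsed steps.
        per_step.append(cumulative / t)
    return cumulative, per_step


# Hypothetical trace: the learner starts poorly and then acts optimally.
cumulative, per_step = regret_trace([0.0, 0.5, 1.0, 1.0],
                                    optimal_average_reward=1.0)
print(cumulative, per_step)
```

A learning algorithm whose per-step regret shrinks toward zero is collecting, on average, nearly as much reward as the optimal policy.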

algorithm, artificial intelligence, upstream oil & gas, (18 more...)

arXiv.org Artificial Intelligence

1611.03907

Country:

North America > United States > California (0.14)
Europe > France > Hauts-de-France (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Dex: Incremental Learning for Complex Environments in Deep Reinforcement Learning

Erickson, Nick, Zhao, Qi

arXiv.org Machine Learning, Jun-18-2017

This paper introduces Dex, a reinforcement learning environment toolkit specialized for training and evaluation of continual learning methods as well as general reinforcement learning problems. We also present the novel continual learning method of incremental learning, where a challenging environment is solved using optimal weight initialization learned from first solving a similar, easier environment. We show that incremental learning can produce results vastly superior to those of standard methods by providing a strong baseline method across ten Dex environments. Finally, we develop a saliency method for qualitative analysis of reinforcement learning, which shows the impact incremental learning has on network attention.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1706.05749

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.64)

Industry:

Education (0.69)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Intelligent Bits: 16 June 2017

#artificialintelligence, Jun-16-2017, 18:22:49 GMT

Facebook fighting extremism with AI -- "The problem, as usual, is determining what is extremist, and what isn't, and it goes further than just jihadists," he said. "Are they just talking about ISIS and Al Qaeda, or are they going to go further to deal with white nationalism and neo-Nazi movements?" AI is big business -- Element AI raises a whopping $102 million to bridge the gap between the haves and have-nots of AI. "Intuitive physics" -- DeepMind claims progress towards AI with a better sense of context and "intuitive physics" via relational reasoning and visual prediction, but obstacles to human-like intelligence remain. Alternative schema -- While deep reinforcement learning (DRL) is all the rage right now, some organizations like Vicarious have taken alternative approaches such as their Schema Networks, which have outperformed some DRL nets albeit with some debate and controversy.

intelligent bit, machine learning, reinforcement learning, (14 more...)

#artificialintelligence

Industry: Law Enforcement & Public Safety > Terrorism (0.67)

Technology:

Information Technology > Communications > Social Media (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)
