AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

PAC Reinforcement Learning with Rich Observations

Krishnamurthy, Akshay, Agarwal, Alekh, Langford, John

arXiv.org Machine Learning, Oct-28-2016

We propose and study a new model for reinforcement learning with rich observations, generalizing contextual bandits to sequential decision making. These models require an agent to take actions based on observations (features) with the goal of achieving long-term performance competitive with a large set of policies. To avoid barriers to sample-efficient learning associated with large observation spaces and general POMDPs, we focus on problems that can be summarized by a small number of hidden states and have long-term rewards that are predictable by a reactive function class. In this setting, we design and analyze a new reinforcement learning algorithm, Least Squares Value Elimination by Exploration. We prove that the algorithm learns near optimal behavior after a number of episodes that is polynomial in all relevant parameters, logarithmic in the number of policies, and independent of the size of the observation space. Our result provides theoretical justification for reinforcement learning with function approximation.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Machine Learning

1602.02722

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.89)

5 EBooks to Read Before Getting into A Machine Learning Career

#artificialintelligence, Oct-27-2016, 01:50:52 GMT

Don't know where to start? If you are looking for something more, you could look here for an overview of MOOCs and online lectures from freely-available university lectures. Of course, nothing substitutes for rigorous formal education, but let's say that isn't in the cards for whatever reason. Not all machine learning positions require a PhD; it really depends where on the machine learning spectrum one wants to fit in. Check out this motivating and inspirational post, the author of which went from little understanding of machine learning to actively and effectively utilizing techniques in their job within a year.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

#artificialintelligence

Country: North America > United States > Minnesota (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.50)

Industry:

Education > Educational Setting > Online (0.70)
Education > Educational Setting > Higher Education (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.54)

paulhendricks/gym-R

#artificialintelligence, Oct-25-2016, 16:10:37 GMT

OpenAI Gym is an open-source Python toolkit for developing and comparing reinforcement learning algorithms. This R package is a wrapper for the OpenAI Gym API, and enables access to an ever-growing variety of environments. If you encounter a clear bug, please file a minimal reproducible example on GitHub.

large language model, paulhendrick gym-r, reinforcement learning, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.83)

Simple reinforcement learning methods to learn CartPole

#artificialintelligence, Oct-25-2016, 09:05:40 GMT

I've been experimenting with OpenAI gym recently, and one of the simplest environments is CartPole. The problem consists of balancing a pole connected with one joint on top of a moving cart. The only actions are to add a force of -1 or 1 to the cart, pushing it left or right. In this post, I will be going over some of the methods described in the CartPole request for research, including implementations and some intuition behind how they work. In CartPole's environment, there are four observations at any given state, representing information such as the angle of the pole and the position of the cart.
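As a rough illustration of the simplest kind of method such a post might cover, the sketch below runs random search over a linear policy on a tiny stand-alone cart-pole simulator. It does not use the OpenAI Gym API and is not the post's code. The physics constants, the 200-step cap, the termination limits, and every function name are assumptions modeled on the classic cart-pole formulation; in particular the push is an assumed fixed-magnitude force to the left or right.

```python
import math
import random

# Assumed constants, modeled on the classic cart-pole formulation.
GRAVITY, CART_MASS, POLE_MASS = 9.8, 1.0, 0.1
TOTAL_MASS = CART_MASS + POLE_MASS
HALF_LENGTH = 0.5                      # half the pole length
POLE_MASS_LENGTH = POLE_MASS * HALF_LENGTH
FORCE_MAG, TAU = 10.0, 0.02            # push magnitude, time step (s)
X_LIMIT = 2.4                          # cart position limit
THETA_LIMIT = 12 * math.pi / 180       # pole angle limit (radians)


def step(state, action):
    """Advance the four observations by one Euler step."""
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if action == 1 else -FORCE_MAG
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + POLE_MASS_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS_LENGTH * theta_acc * cos_t / TOTAL_MASS
    return (x + TAU * x_dot, x_dot + TAU * x_acc,
            theta + TAU * theta_dot, theta_dot + TAU * theta_acc)


def run_episode(weights, rng, max_steps=200):
    """Return the number of steps the linear policy keeps the pole up."""
    state = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
    for t in range(max_steps):
        # Linear policy: push right if the weighted sum is positive.
        score = sum(w * s for w, s in zip(weights, state))
        state = step(state, 1 if score > 0 else 0)
        if abs(state[0]) > X_LIMIT or abs(state[2]) > THETA_LIMIT:
            return t + 1
    return max_steps


def random_search(trials=100, seed=0):
    """Sample random weight vectors and keep the best-scoring one."""
    rng = random.Random(seed)
    best_weights, best_reward = None, 0
    for _ in range(trials):
        weights = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        reward = run_episode(weights, rng)
        if reward > best_reward:
            best_weights, best_reward = weights, reward
        if best_reward == 200:
            break
    return best_weights, best_reward
```

Calling `random_search()` returns the best weight vector found and its episode length. Each candidate is scored on a single episode, so the score is a noisy estimate of policy quality.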

artificial intelligence, machine learning, reinforcement learning, (18 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)

[R] Paper Summary: Human-Level Control Through Deep Reinforcement Learning (DeepMind, Nature) • /r/MachineLearning

@machinelearnbot, Oct-24-2016, 18:50:26 GMT

Didn't click the link but didn't this paper come out a long time ago?

artificial intelligence, deep learning, machine learning, (6 more...)

@machinelearnbot

Industry: Media > News (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning

White, Martha, White, Adam

arXiv.org Artificial Intelligence, Oct-24-2016

One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms. In many large-scale applications, online computation and function approximation represent key strategies in scaling up reinforcement learning algorithms. In this setting, we have effective and reasonably well understood algorithms for adapting the learning-rate parameter, online during learning. Such meta-learning approaches can improve robustness of learning and enable specialization to the current task, improving learning speed. For temporal-difference learning algorithms, which we study here, there is yet another parameter, $\lambda$, that similarly impacts learning speed and stability in practice. Unfortunately, unlike the learning-rate parameter, $\lambda$ parametrizes the objective function that temporal-difference methods optimize. Different choices of $\lambda$ produce different fixed-point solutions, and thus adapting $\lambda$ online and characterizing the optimization is substantially more complex than adapting the learning-rate parameter. There is no meta-learning method for $\lambda$ that achieves (1) incremental updating, (2) compatibility with function approximation, and (3) stability of learning under both on- and off-policy sampling. In this paper we contribute a novel objective function for optimizing $\lambda$ as a function of state rather than time. We derive a new incremental, linear-complexity $\lambda$-adaptation algorithm that does not require offline batch updating or access to a model of the world, and present a suite of experiments illustrating the practicality of our new algorithm in three different settings. Taken together, our contributions represent a concrete step towards black-box application of temporal-difference learning methods in real-world problems.
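To make the role of the trace parameter concrete, here is a minimal sketch of standard tabular TD($\lambda$) with accumulating eligibility traces and a fixed $\lambda$. It is not the adaptive algorithm the abstract proposes. The five-state random-walk task, the step size, and all names are assumptions chosen for illustration.

```python
import random


def td_lambda_random_walk(lam, alpha=0.01, gamma=1.0, episodes=5000,
                          n_states=5, seed=0):
    """Estimate state values on a random-walk chain with TD(lambda).

    Assumed toy task: start in the middle state, move left or right
    with equal probability; exiting on the right pays reward 1,
    exiting on the left pays 0.
    """
    rng = random.Random(seed)
    values = [0.0] * n_states
    for _ in range(episodes):
        traces = [0.0] * n_states          # eligibility traces, reset per episode
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:             # fell off the left end
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:   # fell off the right end
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            td_error = reward + gamma * next_value - values[state]
            traces[state] += 1.0           # accumulating trace
            for s in range(n_states):
                values[s] += alpha * td_error * traces[s]
                traces[s] *= gamma * lam   # lambda sets how fast credit decays
            if done:
                break
            state = next_state
    return values
```

With `lam=0.0` each update touches only the current state (one-step TD); values of `lam` closer to 1 spread each TD error further back along the visited states. In this tabular toy the estimates should approach the same true values, i/6 for the i-th state from the left, for any `lam`; the dependence of the fixed point on $\lambda$ that the abstract describes arises with function approximation.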

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

1607.00446

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

5 EBooks to Read Before Getting into A Machine Learning Career

#artificialintelligence, Oct-21-2016, 21:21:20 GMT

Note that, while there are numerous machine learning ebooks available for free online, including many which are very well-known, I have opted to move past these "regulars" and seek out lesser-known and more niche options for readers. Don't know where to start? If you are looking for something more, you could look here for an overview of MOOCs and online lectures from freely-available university lectures. Of course, nothing substitutes for rigorous formal education, but let's say that isn't in the cards for whatever reason. Not all machine learning positions require a PhD; it really depends where on the machine learning spectrum one wants to fit in.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

#artificialintelligence

Country: North America > United States > Minnesota (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.50)

Industry:

Education > Educational Setting > Online (0.70)
Education > Educational Setting > Higher Education (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)

Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation

Parisi, Simone, Pirotta, Matteo, Restelli, Marcello

Journal of Artificial Intelligence Research, Oct-21-2016

Many real-world control applications, from economics to robotics, are characterized by the presence of multiple conflicting objectives. In these problems, the standard concept of optimality is replaced by Pareto-optimality and the goal is to find the Pareto frontier, a set of solutions representing different compromises among the objectives. Despite recent advances in multi-objective optimization, achieving an accurate representation of the Pareto frontier is still an important challenge. In this paper, we propose a reinforcement learning policy gradient approach to learn a continuous approximation of the Pareto frontier in multi-objective Markov Decision Problems (MOMDPs). Unlike previous policy gradient algorithms, where n optimization routines are executed to obtain n solutions, our approach performs a single gradient ascent run, generating at each step an improved continuous approximation of the Pareto frontier. The idea is to optimize the parameters of a function defining a manifold in the policy parameter space, so that the corresponding image in the objective space gets as close as possible to the true Pareto frontier. Besides deriving how to compute and estimate this gradient, we will also discuss the non-trivial issue of defining a metric to assess the quality of the candidate Pareto frontiers. Finally, the properties of the proposed approach are empirically evaluated on two problems, a linear-quadratic Gaussian regulator and a water reservoir control task.
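The notion of Pareto-optimality used in this abstract can be illustrated with a plain dominance filter over a finite set of candidate solutions. This is only a sketch of the definition, assuming every objective is to be maximized. It is not the paper's continuous manifold approach, and the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# Hypothetical returns of six policies on two conflicting objectives.
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (0, 6), (3, 1)]
front = pareto_front(candidates)
# front == [(1, 5), (2, 4), (3, 3), (0, 6)]
```

Here `(2, 2)` and `(3, 1)` are dropped because `(3, 3)` is at least as good on both objectives and strictly better on one; the remaining four points are different compromises, none of which beats another on both objectives.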

frontier, machine learning, reinforcement learning, (20 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.4961

AI Access Foundation

11026

Country:

Europe > Germany (0.28)
North America > United States > New York (0.14)
North America > United States > Wisconsin (0.14)
(9 more...)

Genre:

Research Report (0.67)
Overview (0.46)

Industry: Energy > Oil & Gas (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Quantum artificial intelligence could lead to super-smart machines

#artificialintelligence, Oct-19-2016, 17:52:20 GMT

Quantum physics has some spooky, counterintuitive effects, but it could also be essential to how actual intuition works, at least with regard to artificial intelligence. In a new study, researcher Vedran Dunjko and co-authors applied a quantum analysis to a field within artificial intelligence called reinforcement learning, which deals with how to program a machine to make appropriate choices to maximize a cumulative reward. The field is surprisingly complex and must take into account everything from game theory to information theory. Dunjko and his team found that quantum effects, when applied to reinforcement learning in artificial intelligence systems, could provide quadratic improvements in learning efficiency, reports Phys.org. Exponential improvements might even be possible over short-term performance tasks.

artificial intelligence, machine learning, reinforcement learning, (6 more...)

#artificialintelligence

Genre: Research Report (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Towards deep symbolic reinforcement learning

#artificialintelligence, Oct-16-2016, 11:30:25 GMT

Every now and then I read a paper that makes a really strong connection with me, one where I can't stop thinking about the implications and I can't wait to share it with all of you. For me, this is one such paper. In the great see-saw of popularity for artificial intelligence techniques, symbolic reasoning and neural networks have taken turns, each having their dominant decade(s). The popular wisdom is that data-driven learning techniques (machine learning) won. Symbolic reasoning systems were just too hard and fragile to be successful at scale.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
