AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

5d44ee6f2c3f71b73125876103c8f6c4-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-9-2025, 14:34:08 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.44)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Oct-9-2025, 14:09:05 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance.

This is a very well-written paper that explores the use of weighted importance sampling to speed up learning in off-policy LSTD-type algorithms. The theoretical results are solid and what one would expect. The computational results are striking. The technique could serve as a useful component in the design of RL algorithms.

Q2: Please summarize your review in 1-2 sentences

The paper is very well-written and presents a useful idea validated by striking computational results.

algorithm, function approximation, value function approximation, (11 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)

Weighted importance sampling for off-policy learning with linear function approximation

A. Rupam Mahmood, Hado P. van Hasselt, Richard S. Sutton

Neural Information Processing Systems, Oct-9-2025, 14:09:03 GMT

Importance sampling is an essential component of off-policy model-free reinforcement learning algorithms. However, its most effective variant, weighted importance sampling, does not carry over easily to function approximation and, because of this, it is not utilized in existing off-policy learning algorithms. In this paper, we take two steps toward bridging this gap. First, we show that weighted importance sampling can be viewed as a special case of weighting the error of individual training samples, and that this weighting has theoretical and empirical benefits similar to those of weighted importance sampling. Second, we show that these benefits extend to a new weighted-importance-sampling version of off-policy LSTD(λ). We show empirically that our new WIS-LSTD(λ) algorithm can result in much more rapid and reliable convergence than conventional off-policy LSTD(λ) (Yu 2010, Bertsekas & Yu 2009).
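
To make the contrast in the abstract concrete, here is a minimal sketch of ordinary versus weighted importance sampling for a single scalar value estimate. It covers only the tabular case, not the paper's WIS-LSTD(λ) algorithm. The function names `ois_estimate` and `wis_estimate`, and the toy returns and ratios, are assumptions made for illustration. Each sample is a return `G` observed under the behavior policy together with its importance-sampling ratio `rho` (target probability divided by behavior probability).

```python
def ois_estimate(returns, ratios):
    """Ordinary importance sampling: mean of the ratio-scaled returns."""
    weighted = sum(rho * g for rho, g in zip(ratios, returns))
    return weighted / len(returns)


def wis_estimate(returns, ratios):
    """Weighted importance sampling: normalize by the sum of the ratios.

    Equivalently, this is the value v that minimizes the weighted squared
    error sum(rho_i * (G_i - v) ** 2), i.e. each sample's error is weighted
    by its ratio.
    """
    total = sum(ratios)
    if total == 0.0:
        return 0.0  # hypothetical convention when no sample has weight
    weighted = sum(rho * g for rho, g in zip(ratios, returns))
    return weighted / total


# Toy data (illustrative only): both samples returned 1.0, and the target
# policy was 1.8 times as likely as the behavior policy to take the
# sampled action.
returns = [1.0, 1.0]
ratios = [1.8, 1.8]

print(ois_estimate(returns, ratios))  # → 1.8, above every observed return
print(wis_estimate(returns, ratios))  # → 1.0, stays within the observed returns
```

Because the weighted estimate is a convex combination of the observed returns, it cannot leave their range, which is one informal way to see why it tends to have lower variance than the ordinary estimate, at the cost of some bias on finite samples.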

algorithm, function approximation, wis-lstd, (13 more...)

Country: