AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee

Neural Information Processing SystemsNov-20-2025, 21:06:17 GMT

We propose stochastic ensemble value expansion (STEVE), a novel model-based technique that addresses this issue. By dynamically interpolating between model rollouts of various horizon lengths for each individual example, STEVE ensures that the model is only utilized when doing so does not introduce significant errors.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Mountain View (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Data-Efficient Hierarchical Reinforcement Learning

Ofir Nachum, Shixiang (Shane) Gu, Honglak Lee, Sergey Levine

Neural Information Processing SystemsNov-20-2025, 20:53:34 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Joint Active Feature Acquisition and Classification with Variable-Size Set Encoding

Hajin Shim, Sung Ju Hwang, Eunho Yang

Neural Information Processing SystemsNov-20-2025, 20:52:44 GMT

Neural Information Processing Systems http://nips.cc/

classifier, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > South Korea (0.04)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Learning to Navigate in Cities Without a Map

Piotr Mirowski, Matt Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, koray kavukcuoglu, Andrew Zisserman, Raia Hadsell

Neural Information Processing SystemsNov-20-2025, 20:42:18 GMT

The majority of algorithms involve building an explicit map during an exploration phase and then planning and acting via that representation. In this work, we are interested in pushing the limits of end-to-end deep reinforcement learning for navigation by proposing new methods and demonstrating their performance in large-scale, real-world environments.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning

Ofir Marom, Benjamin Rosman

Neural Information Processing SystemsNov-20-2025, 20:39:30 GMT

Neural Information Processing Systems http://nips.cc/

large language model, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > Massachusetts (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Industry:

Transportation > Passenger (0.55)
Transportation > Ground > Road (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.42)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.41)

Add feedback

Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou

Neural Information Processing SystemsNov-20-2025, 20:37:37 GMT

In this paper, we propose a new off-policy estimator that applies IS directly on the stationary state-visitation distributions to avoid the exploding variance faced by existing methods.

machine learning, reinforcement learning, trajectory, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)

Add feedback

Occam's razor is insufficient to infer the preferences of irrational agents

Stuart Armstrong, Sören Mindermann

Neural Information Processing SystemsNov-20-2025, 20:28:08 GMT

In today's reinforcement learning systems, a simple reward function is often hand-crafted, and still

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
(2 more...)

Add feedback

Exploration in Structured Reinforcement Learning

Jungseul Ok, Alexandre Proutiere, Damianos Tranos

Neural Information Processing SystemsNov-20-2025, 20:24:33 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > Illinois (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)

Add feedback

Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Jiaxuan You, Bowen Liu, Zhitao Ying, Vijay Pande, Jure Leskovec

Neural Information Processing SystemsNov-20-2025, 20:23:14 GMT

Generating novel graph structures that optimize given objectives while obeying some given underlying rules is fundamental for chemistry, biology and social science research.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(2 more...)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

Genetic-Gated Networks for Deep Reinforcement Learning

Simyung Chang, John Yang, Jaeseok Choi, Nojun Kwak

Neural Information Processing SystemsNov-20-2025, 20:21:57 GMT

Exploiting the short-sighted gradients should be balanced with adequate explorations. Explorations thus should be designed irrelevant to policy gradients in order to guide the policy to unseen states.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback