AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
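
The trial-and-error idea in the quote can be illustrated with a minimal sketch. It assumes a hypothetical three-armed bandit with Gaussian rewards and uses an epsilon-greedy rule, which is one simple exploration strategy chosen here for illustration and is not prescribed by the quoted text. The names `run_bandit`, `true_means`, and `epsilon` are made up for this example.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a hypothetical Gaussian bandit.

    The learner is never told which action is best; it only sees the
    reward of the action it tried and keeps a running mean per action.
    """
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running mean reward per action
    counts = [0] * n_actions       # number of times each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest estimate so far.
            action = max(range(n_actions), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental update of the sample mean.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 1.0])
best = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", estimates)
print("action judged best:", best)
```

With enough steps, the estimates for frequently tried actions approach their true means, so the learner ends up favoring the highest-reward action without ever being told which one it is.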

ed3cd2520148b577039adfade82a5566-Paper-Conference.pdf

Neural Information Processing Systems Feb-12-2026, 17:20:31 GMT

dynamic gap, learning, simulator, (11 more...)

Country:

North America > United States (0.05)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > India (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Object-Category Aware Reinforcement Learning

Neural Information Processing Systems Feb-12-2026, 16:40:27 GMT

Reinforcement Learning (RL) has achieved impressive progress in recent years, such as results in Atari [24] and Go [28], in which RL agents even perform better than human beings.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: Asia > China (0.05)

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning

Yonathan Efroni, Gal Dalal, Bruno Scherrer, Shie Mannor

Neural Information Processing Systems Feb-12-2026, 16:33:26 GMT

Our framework is the infinite-horizon discounted Markov Decision Process (MDP).

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > France (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
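
For readers unfamiliar with the framework named in the entry above: an infinite-horizon discounted MDP scores a policy by the expected sum of rewards weighted by powers of a discount factor gamma in [0, 1). The sketch below is only an illustration under stated assumptions. It builds a hypothetical two-state, two-action MDP (the tables `P` and `R` and the value `GAMMA` are invented) and solves it with standard value iteration, which is a textbook method and not the multiple-step greedy approach the paper studies.

```python
# Hypothetical 2-state, 2-action MDP (all numbers are illustrative).
# P[s][a] is a list of (next_state, probability); R[s][a] is the reward.
P = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 0.0},
    1: {0: 0.0, 1: 1.0},
}
GAMMA = 0.9  # discount factor, must be in [0, 1)

def value_iteration(P, R, gamma, tol=1e-10):
    """Repeatedly apply the Bellman optimality operator until the
    largest change across states falls below `tol`."""
    v = {s: 0.0 for s in P}
    while True:
        new_v = {
            s: max(
                R[s][a] + gamma * sum(p * v[s2] for s2, p in P[s][a])
                for a in P[s]
            )
            for s in P
        }
        if max(abs(new_v[s] - v[s]) for s in P) < tol:
            return new_v
        v = new_v

v_star = value_iteration(P, R, GAMMA)
print(v_star)
```

In this toy problem the best behavior is to move to state 1 and stay there, so the optimal values are 1 / (1 - 0.9) = 10 for state 1 and 0.9 * 10 = 9 for state 0, which the iteration approaches.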

An Off-policy Policy Gradient Theorem Using Emphatic Weightings

Ehsan Imani, Eric Graves, Martha White

Neural Information Processing Systems Feb-12-2026, 16:32:08 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, emphatic weighting, gradient, (12 more...)

Country:

North America > Canada > Alberta (0.15)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence

Carlo Alfano (Department of Statistics, University of Oxford)

Neural Information Processing Systems Feb-12-2026, 16:30:57 GMT

In this work, we introduce a framework for policy optimization based on mirror descent that naturally accommodates general parameterizations. The policy class induced by our scheme recovers known classes, e.g., softmax, and generates new ones depending on the choice of mirror map.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.50)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)
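
The remark in the entry above that a mirror map can recover the softmax class can be made concrete in the simplest setting. The sketch below assumes a single state, fixed hypothetical action values `q`, and the negative-entropy mirror map, for which the mirror-descent step is the well-known multiplicative update pi_new(a) ∝ pi(a) * exp(eta * q(a)). It is the textbook tabular special case, not the paper's general-parameterization algorithm, and `md_step`, `q`, and `eta` are names invented for this example.

```python
import math

def md_step(policy, q_values, eta):
    """One mirror-descent step under the negative-entropy mirror map:
    pi_new(a) is proportional to pi(a) * exp(eta * q(a))."""
    weights = [p * math.exp(eta * q) for p, q in zip(policy, q_values)]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

q = [1.0, 0.5, -0.2]   # hypothetical fixed action values
eta = 0.3              # hypothetical step size
steps = 10
policy = [1 / 3] * 3   # start from the uniform policy

for _ in range(steps):
    policy = md_step(policy, q, eta)

# Starting from uniform, the iterates stay inside the softmax class:
# after `steps` updates the policy equals softmax(steps * eta * q).
closed_form = softmax([steps * eta * x for x in q])
print(policy)
print(closed_form)
```

Choosing a different mirror map changes the update rule and therefore the policy class the iterates live in, which is the design freedom the snippet refers to.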

Hierarchical Decision Making by Generating and Following Natural Language Instructions

Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

Neural Information Processing Systems Feb-12-2026, 16:21:20 GMT

We explore representing complex actions as natural language instructions.

machine learning, natural language, reinforcement learning, (18 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Overview (0.68)

Industry:

Leisure & Entertainment > Games > Computer Games (0.71)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)

Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs

Neural Information Processing Systems Feb-12-2026, 16:12:01 GMT

How to design efficient algorithms is a central problem for reinforcement learning (RL).

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

ebba182cb97864368fdb6ae00773a5e4-Paper-Conference.pdf

Neural Information Processing Systems Feb-12-2026, 16:11:58 GMT

algorithm, bandit, linear mixture mdp, (11 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Neural Information Processing Systems Feb-12-2026, 16:11:22 GMT

In this paper, we take a step toward narrowing the gap between model-based and model-free RL methods.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

6191ab7080c840f67eaf5dff7d5edfcb-Supplemental-Conference.pdf

Neural Information Processing Systems Feb-12-2026, 16:10:47 GMT

Diversity in equally-performing policies. We show that different neighborhoods correspond to different post-update return distributions and agent behaviors. We discover that at equal average returns, different policies obtained by the same deep RL algorithm may in fact have substantially different distributional profiles, as measured by statistics of the post-update return distribution.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States > Louisiana (0.04)
North America > Canada > Quebec (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)