AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Temporal Regularization for Markov Decision Process

Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup

Neural Information Processing SystemsFeb-12-2026, 18:55:45 GMT

Yetinreinforcementlearning,duetothenatureofthe Bellman equation, there isanopportunity toalsoexploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Add feedback

Exponentially Weighted Imitation Learning for Batched Historical Data

Qing Wang, Jiechao Xiong, Lei Han, peng sun, Han Liu, Tong Zhang

Neural Information Processing SystemsFeb-12-2026, 18:47:41 GMT

We consider deep policy learning with only batched historical trajectories. The main challenge of this problem is that the learner no longer has a simulator or "environment oracle" as in most reinforcement learning settings.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Add feedback

8252831b9fce7a49421e622c14ce0f65-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-12-2026, 18:45:39 GMT

algorithm, dtrs, online rl, (13 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Add feedback

efe36e55d80a94d1726f660b8d237a0f-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 18:38:30 GMT

locality and sparsity structure, predicate, representation, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.69)
(4 more...)

Add feedback

LAPO: Latent-VariableAdvantage-WeightedPolicy OptimizationforOfflineReinforcementLearning

Neural Information Processing SystemsFeb-12-2026, 18:26:51 GMT

But in practice, it requires querying the behavior policy which is unknown, and using an erroneous approximation of the behavior policy can negatively affect the performance ([39]).

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL Chen Sun

Neural Information Processing SystemsFeb-12-2026, 18:26:03 GMT

In real life, success is often contingent upon multiple critical steps that are distant in time from each other and from the final reward.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.68)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Teaching Inverse Reinforcement Learners via Features and Demonstrations

Luis Haug, Sebastian Tschiatschek, Adish Singla

Neural Information Processing SystemsFeb-12-2026, 18:17:42 GMT

Weintroduceanaturalquantity,the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms basedoninversereinforcement learning. Basedonthesefindings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimalpolicy.

machine learning, reinforcement learning, teaching risk, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

Add feedback

TowardsVersatileEmbodiedNavigation

Neural Information Processing SystemsFeb-12-2026, 18:16:10 GMT

VIENNA performs robust on unseen environments, through learning task-shared, general representations.

machine learning, natural language, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.34)
Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Add feedback

eda9523faa5e7191aee1c2eaff669716-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 17:27:04 GMT

Though promising results have been reported on some RL application domains, policies learned with such representations usually fail to generalize well in a complex environment because minimizing a reconstruction loss may potentially introduce local (visual) features with task-irrelevant information.

machine learning, reinforcement learning, urlhttp, (15 more...)

Neural Information Processing Systems

Country: