AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

89c61fce5a8b73871d1c4073f486b134-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 14:22:37 GMT

markov chain, polynomial, scalable mdp, (14 more...)

Country:

North America > United States > New Jersey (0.04)
North America > United States > Massachusetts (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

What Did You Think Would Happen? Explaining Agent Behaviour through Intended Outcomes

Neural Information Processing Systems | Feb-10-2026, 14:14:23 GMT

This allows for local interpretation of the agent's intention based on its behavioural policy.

explanation, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

ab6439fa2daf0246f92eea433bca5ac4-Supplemental.pdf

Neural Information Processing Systems | Feb-10-2026, 14:14:09 GMT

algorithm, bellman equation, equation, (12 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

Neural Information Processing Systems | Feb-10-2026, 14:14:05 GMT

We study risk-sensitive reinforcement learning (RL) based on the entropic risk measure. Although existing works have established non-asymptotic regret guarantees for this problem, they leave open an exponential gap between the upper and lower bounds. We identify the deficiencies in existing algorithms and their analysis that result in such a gap. To remedy these deficiencies, we investigate a simple transformation of the risk-sensitive Bellman equations, which we call the exponential Bellman equation.
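
As a rough sketch of what such a transformation can look like, the block below writes the entropic risk objective and a risk-sensitive Bellman equation for a fixed policy, then exponentiates both sides. The notation is an assumption made here for illustration and is not taken from the listing: an episodic setting with horizon $H$, risk parameter $\beta \neq 0$, deterministic reward $r_h$, transition kernel $P_h$, and value functions $V_h^\pi$, $Q_h^\pi$.

```latex
% Entropic risk objective (assumed notation)
V_1^\pi(s) = \frac{1}{\beta}\,\log \mathbb{E}_\pi\!\left[\exp\!\Big(\beta \sum_{h=1}^{H} r_h(s_h,a_h)\Big) \,\Big|\, s_1 = s\right]

% Risk-sensitive Bellman equation for a fixed policy \pi
Q_h^\pi(s,a) = r_h(s,a) + \frac{1}{\beta}\,\log \mathbb{E}_{s' \sim P_h(\cdot \mid s,a)}\!\left[\exp\!\big(\beta\, V_{h+1}^\pi(s')\big)\right]

% Multiplying by \beta and exponentiating both sides gives an exponential form
\exp\!\big(\beta\, Q_h^\pi(s,a)\big) = \mathbb{E}_{s' \sim P_h(\cdot \mid s,a)}\!\left[\exp\!\big(\beta\,(r_h(s,a) + V_{h+1}^\pi(s'))\big)\right]
```

In the last line the logarithm has disappeared and the right-hand side is a plain expectation, which is one reason a form like this may be easier to analyse.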

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)

d55cbf210f175f4a37916eafe6c04f0d-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 14:12:38 GMT

algorithm, bail, batch, (13 more...)

Country:

North America > United States > Arizona > Maricopa County > Phoenix (0.04)
North America > Canada (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

ab452534c5ce28c4fbb0e102d4a4fb2e-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 14:12:24 GMT

international conference, platform, reinforcement learning, (13 more...)

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.68)

Industry: Information Technology > Services (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)

3b3889d313ba9476c12c2d77ea66b24f-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 13:58:08 GMT

dataset, history length, trajectory, (13 more...)

Country:

North America > United States > Montana (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.46)

Industry:

Health & Medicine (0.47)
Leisure & Entertainment > Games (0.32)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

The MAGICAL Benchmark for Robust Imitation

Neural Information Processing Systems | Feb-10-2026, 13:44:36 GMT

The robot could learn from these demonstrations to complete the tasks autonomously. For IL algorithms to be useful, however, they must be able to learn how to perform tasks from few demonstrations. A domestic robot wouldn't be very helpful if it required thirty demonstrations before it figured out that you are deliberately washing your purple cravat

machine learning, reinforcement learning, variant, (17 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Instructional Material (0.34)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

3af25aa3de8b7b02ddbd1b6be5031be8-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 13:16:12 GMT

dataset, isw-bc, nbcu, (15 more...)

Country:

North America > United States (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.67)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

2ea07a4acbf7e38913062fd69a70805f-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 13:13:45 GMT

By identifying an instrumental variable correlated with the variable X but unrelated to the confounders, researchers can isolate the exogenous variation in X and estimate a causal relationship between X and Y.
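
The idea in this excerpt can be illustrated with a small simulation. The sketch below is not the paper's method: it uses two-stage least squares, a standard textbook instrumental-variable estimator, on synthetic data. The names `ols_slope` and `two_stage_least_squares`, the linear data-generating process, and the true effect of 2.0 are all assumptions made here for illustration.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of y regressed on x (with an intercept) by least squares."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


def two_stage_least_squares(z, x, y):
    """Hypothetical helper: estimate the effect of x on y using instrument z."""
    # Stage 1: keep only the part of x that is explained by the instrument.
    design_z = np.column_stack([np.ones_like(z), z])
    first_stage, *_ = np.linalg.lstsq(design_z, x, rcond=None)
    x_hat = design_z @ first_stage
    # Stage 2: regress y on that exogenous part of x.
    return ols_slope(x_hat, y)


# Synthetic data (assumed): u is an unobserved confounder, z is the instrument.
rng = np.random.default_rng(0)
n = 20_000
u = rng.normal(size=n)                       # affects both x and y
z = rng.normal(size=n)                       # affects y only through x
x = z + u + rng.normal(size=n)
y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # true causal effect of x is 2.0

beta_ols = ols_slope(x, y)                   # biased upward by the confounder
beta_iv = two_stage_least_squares(z, x, y)   # should land near 2.0

print(f"naive OLS estimate: {beta_ols:.2f}")
print(f"2SLS estimate:      {beta_iv:.2f}")
```

Under these assumptions the naive regression absorbs the confounder's influence and overstates the effect, while the two-stage estimate uses only the variation in X that comes from the instrument.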

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
