AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Learning Reward Machines for Partially Observable Reinforcement Learning

Neural Information Processing Systems, Feb-12-2026, 04:38:21 GMT

RL agents learn policies from experience.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

d6f681da2151687df12cc21a1c1e3527-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 04:37:12 GMT

algorithm, function approximation, international conference, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Game Theory (0.93)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
(2 more...)

Planning in entropy-regularized Markov decision processes and games

Jean-Bastien Grill, Omar Darwiche Domingues, Pierre Menard, Remi Munos, Michal Valko

Neural Information Processing Systems, Feb-12-2026, 03:52:48 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, sample complexity, value function, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Europe > Germany > Saxony-Anhalt > Magdeburg (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.42)

Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

Tengyang Xie, Yifei Ma, Yu-Xiang Wang

Neural Information Processing Systems, Feb-12-2026, 03:43:58 GMT

Solving off-policy evaluation (OPE) is often the starting point in many RL applications. To tackle OPE, importance sampling (IS) corrects the mismatch between the distributions induced by the behavior policy and the target policy.
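
The correction that importance sampling applies can be illustrated with a minimal sketch. It shows ordinary trajectory-wise IS only, not the marginalized estimator that the paper's title refers to. The function name `is_estimate`, the two policy functions, and the toy single-state dataset are all assumptions made for illustration and do not come from the paper.

```python
def is_estimate(trajectories, target_prob, behavior_prob, gamma=1.0):
    """Ordinary (trajectory-wise) importance sampling estimate of the
    target policy's expected return from behavior-policy data.

    trajectories: list of trajectories, each a list of (state, action, reward).
    target_prob / behavior_prob: functions (state, action) -> probability.
    """
    total = 0.0
    for trajectory in trajectories:
        weight = 1.0    # product of per-step probability ratios
        ret = 0.0       # discounted return of this trajectory
        discount = 1.0
        for state, action, reward in trajectory:
            weight *= target_prob(state, action) / behavior_prob(state, action)
            ret += discount * reward
            discount *= gamma
        total += weight * ret
    return total / len(trajectories)


# Hypothetical toy problem: one state, two actions, one step per trajectory.
def behavior_prob(state, action):
    return 0.5  # behavior policy picks either action uniformly


def target_prob(state, action):
    return 0.9 if action == 1 else 0.1  # target policy prefers action 1


# Action 1 pays reward 1, action 0 pays reward 0; one sample of each action.
data = [
    [("s", 0, 0.0)],
    [("s", 1, 1.0)],
]

estimate = is_estimate(data, target_prob, behavior_prob)
print(estimate)
```

In this toy setting the target policy's true value is 0.9 (it picks the rewarding action with probability 0.9), and the reweighted average of the behavior data recovers a value close to that. On longer horizons the product of ratios can have very high variance, which is the usual motivation for alternatives to trajectory-wise weighting.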

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

4fe5149039b52765bde64beb9f674940-Paper.pdf

Neural Information Processing Systems, Feb-12-2026, 03:41:50 GMT

algorithm, learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Industry: Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

d4cc7a2d0d70736e29a3b48c3729bc06-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 03:38:06 GMT

arxiv preprint arxiv, encoder, stage 2, (11 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)
(2 more...)

VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning

Neural Information Processing Systems, Feb-12-2026, 03:38:00 GMT

Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g. ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g. a limited number of expert demonstrations) to convert the task-agnostic representations into more powerful task-specific representations; in stage 3, we fine-tune the agent with online RL.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Arizona > Maricopa County > Phoenix (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

DAC: The Double Actor-Critic Architecture for Learning Options

Shangtong Zhang, Shimon Whiteson

Neural Information Processing Systems, Feb-12-2026, 03:26:23 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, master policy, policy optimization algorithm, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
Information Technology > Artificial Intelligence > Systems & Languages > Problem-Specific Architectures (0.40)

58a799d16fb0c1f2014e98f4ba972b25-Paper-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 03:11:03 GMT

RL that utilize function approximation to generalize observational data to unknown states/actions. The goal of this paper is to study the sample complexity of policy-based RL, which is arguably the simplest setting for RL with function approximation (Kearns et al., 1999; Kakade, 2003).

machine learning, reinforcement learning, trajectory, (15 more...)

Neural Information Processing Systems

Country: