AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning

Neural Information Processing Systems, Feb-7-2026, 12:17:31 GMT

We propose to learn to distinguish reversible from irreversible actions for better informed decision-making in Reinforcement Learning (RL).

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

0e915db6326b6fb6a3c56546980a8c93-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 12:16:47 GMT

agent, aired, plr, (15 more...)

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.94)
Leisure & Entertainment > Games (0.93)
Leisure & Entertainment > Sports > Motorsports > Formula One (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

06fc38f5c21ae66ef955e28b7a78ece5-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 12:16:14 GMT

dimension, eluder dimension, international conference, (12 more...)

Country:

Asia > Middle East > UAE (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Workflow (0.46)
Research Report (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

A Unifying View of Optimism in Episodic Reinforcement Learning

Neural Information Processing Systems, Feb-7-2026, 11:57:10 GMT

A finite episodic Markov decision process (MDP) is a tuple $(\mathcal{S}, \mathcal{A}, H, \alpha, P, r)$, where $\mathcal{S}$ and $\mathcal{A}$ are the finite sets of states and actions with $S = |\mathcal{S}|$ and $A = |\mathcal{A}|$, $H$ is the (fixed) episode length, and $\alpha$ is the initial state distribution.
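To make the tuple in this excerpt concrete, here is a minimal Python sketch of a finite episodic MDP together with standard backward induction over the H steps. It is an illustration only, not code from the paper. The excerpt does not define P and r, so the sketch assumes a stage-independent transition kernel P[s][a][s'] and deterministic rewards r[s][a]; the names `EpisodicMDP`, `optimal_values`, `initial_value`, and `toy_mdp` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EpisodicMDP:
    """Finite episodic MDP (S, A, H, alpha, P, r), tabular form (assumed layout)."""
    H: int                        # fixed episode length
    alpha: List[float]            # alpha[s]: initial state distribution
    P: List[List[List[float]]]    # P[s][a][s2]: transition probabilities (assumption: same at every step)
    r: List[List[float]]          # r[s][a]: deterministic reward (assumption)

    @property
    def S(self) -> int:
        return len(self.P)        # number of states

    @property
    def A(self) -> int:
        return len(self.P[0])     # number of actions


def optimal_values(mdp: EpisodicMDP) -> List[List[float]]:
    """Backward induction: V[h][s] is the best achievable return from step h in state s."""
    V = [[0.0] * mdp.S for _ in range(mdp.H + 1)]   # V[H][s] = 0 after the episode ends
    for h in range(mdp.H - 1, -1, -1):
        for s in range(mdp.S):
            V[h][s] = max(
                mdp.r[s][a] + sum(mdp.P[s][a][t] * V[h + 1][t] for t in range(mdp.S))
                for a in range(mdp.A)
            )
    return V


def initial_value(mdp: EpisodicMDP) -> float:
    """Expected optimal return when the first state is drawn from alpha."""
    V0 = optimal_values(mdp)[0]
    return sum(mdp.alpha[s] * V0[s] for s in range(mdp.S))


def toy_mdp() -> EpisodicMDP:
    """Hypothetical 2-state, 2-action example: state 1 pays reward 1 per step and is absorbing."""
    P = [
        [[1.0, 0.0], [0.0, 1.0]],   # state 0: action 0 stays, action 1 moves to state 1
        [[0.0, 1.0], [0.0, 1.0]],   # state 1: both actions stay in state 1
    ]
    r = [[0.0, 0.0], [1.0, 1.0]]
    return EpisodicMDP(H=3, alpha=[1.0, 0.0], P=P, r=r)
```

In the toy example the agent starts in state 0, spends one step moving to state 1, and then collects reward 1 on each of the two remaining steps, so `initial_value(toy_mdp())` evaluates to 2.0.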

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

074f42212be2c8ee651db00f17965ec4-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 11:47:47 GMT

category, reflection, reward function, (14 more...)

Country:

Asia > India (0.04)
Europe > Netherlands (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Nigeria (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.46)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
(2 more...)

Monte Carlo Augmented Actor-Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations

Neural Information Processing Systems, Feb-7-2026, 11:46:33 GMT

This is particularly challenging for high-dimensional control tasks, in which there may be a large number of factors that influence the agent's objective.

demonstration, machine learning, reinforcement learning, (15 more...)

Country: North America > United States > Utah (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

0dc23b6a0e4abc39904388dd3ffadcd1-Supplemental.pdf

Neural Information Processing Systems, Feb-7-2026, 11:33:50 GMT

algorithm, max null ln, probability, (16 more...)

Genre: Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Provably Good Batch Reinforcement Learning Without Great Exploration

Neural Information Processing Systems, Feb-7-2026, 11:33:42 GMT

This is because, in the traditional analysis, the error bound scales up with this ratio. We show that using pessimistic value estimates in the low-data regions in Bellman optimality and evaluation back-up can yield more adaptive and stronger guarantees when the concentrability assumption does not hold.
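One way to picture "pessimistic value estimates in the low-data regions" is a tabular Bellman optimality backup that refuses to credit state-action pairs seen too rarely in the batch. The sketch below is a simplified illustration under stated assumptions, not the paper's algorithm: it assumes a discounted tabular setting, nonnegative rewards (so that 0 is a valid lower bound on any value), and a hypothetical count threshold `min_count`; the function name and all parameters are invented for this example.

```python
from typing import List


def pessimistic_value_iteration(
    counts: List[List[int]],            # counts[s][a]: visits to (s, a) in the batch
    P_hat: List[List[List[float]]],     # P_hat[s][a][s2]: estimated transition probabilities
    r_hat: List[List[float]],           # r_hat[s][a]: estimated rewards, assumed >= 0
    gamma: float = 0.9,                 # discount factor (assumption)
    min_count: int = 5,                 # hypothetical coverage threshold
    iters: int = 200,
) -> List[List[float]]:
    """Value iteration in which poorly covered (s, a) pairs are pinned to value 0."""
    S, A = len(counts), len(counts[0])
    Q = [[0.0] * A for _ in range(S)]
    for _ in range(iters):
        # Next-state value maximizes only over well-covered actions; 0 if there are none.
        V = [
            max((Q[s][a] for a in range(A) if counts[s][a] >= min_count), default=0.0)
            for s in range(S)
        ]
        Q = [
            [
                r_hat[s][a] + gamma * sum(P_hat[s][a][t] * V[t] for t in range(S))
                if counts[s][a] >= min_count
                else 0.0                # pessimistic value in the low-data region
                for a in range(A)
            ]
            for s in range(S)
        ]
    return Q
```

With a single state, a well-covered action whose estimated reward is 0.5, and a barely observed action whose estimated reward is 1.0, the backup keeps the second action at 0 and the greedy policy prefers the well-covered one, even though a naive backup would chase the unreliable estimate.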

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
North America > Canada > Alberta (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)

0663a39baab211328fc865f91abc75ab-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 11:33:25 GMT

algorithm, pseudo-stochastic state, transition, (14 more...)

Country:

North America > United States > Washington (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre: Research Report > New Finding (0.92)

Industry:

Information Technology (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

0ee633a6ade45eab4276352b3ee79c7a-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 11:32:41 GMT

A fundamental difference between our learning problem and standard RL problems is that the realized reward feedback from conversion incrementality is mixed and delayed.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: North America > United States > Virginia (0.04)

Genre: Research Report (0.46)

Industry:

Marketing (0.47)
Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
