AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Natural Policy Gradient Primal-Dual Method for Constrained Markov Decision Processes

Neural Information Processing SystemsFeb-8-2026, 14:45:34 GMT

We study sequential decision-making problems in which each agent aims to maximize the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve the discounted infinite-horizon Constrained Markov Decision Processes (CMDPs) problem. Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method for CMDPs which updates the primal variable via natural policy gradient ascent and the dual variable via projected sub-gradient descent.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California (0.14)
North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Industry:

Health & Medicine (0.93)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Add feedback

GRASP: NavigatingRetrosyntheticPlanningwith Goal-drivenPolicy

Neural Information Processing SystemsFeb-8-2026, 14:27:06 GMT

Retrosynthetic planning occupies a crucial position in synthetic chemistry and, accordingly, drug discovery, which aims to find synthetic pathways of a target molecule through a sequential decision-making process on a set of feasible reactions.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Zhejiang Province (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry: Materials > Chemicals (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

4dea382d82666332fb564f2e711cbc71-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 14:26:55 GMT

argument, latexit sha1, theorem, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
South America > Brazil (0.04)
North America > United States > North Carolina > Wake County > Morrisville (0.04)
(9 more...)

Genre:

Research Report (0.68)
Instructional Material > Course Syllabus & Notes (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

17af43527227c5c96db0f8d4c6aadc4e-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 14:16:24 GMT

agent, gradient, world model, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

5d44ee6f2c3f71b73125876103c8f6c4-Supplemental.pdf

Neural Information Processing SystemsFeb-8-2026, 14:06:17 GMT

algorithm, contraction, corollary 2, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)
(5 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Finite-SampleAnalysisofContractiveStochastic ApproximationUsingSmoothConvexEnvelopes

Neural Information Processing SystemsFeb-8-2026, 14:06:10 GMT

Inthispaper,weconsider an SA involving a contraction mapping with respect to an arbitrary norm, and show its finite-sample error bounds while using different stepsizes.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Champaign (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Few

Neural Information Processing SystemsFeb-8-2026, 14:04:32 GMT

Deep reinforcement learning (RL) algorithms have demonstrated promising results on a variety of complex tasks, such as robotic manipulation [22, 13] and strategy games [27, 38].

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Maximum-Likelihood InverseReinforcementLearning withFinite-TimeGuarantees

Neural Information Processing SystemsFeb-8-2026, 14:03:46 GMT

Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy that best fits observed sequences of states and actions implemented byanexpert.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: