AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Simple random search of static linear policies is competitive for reinforcement learning

Horia Mania, Aurelia Guy, Benjamin Recht

Neural Information Processing SystemsFeb-13-2026, 07:47:30 GMT

Model-free reinforcement learning aims to offer off-the-shelf solutions for controlling dynamical systems without requiring models of the system dynamics.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives

Wang Chi Cheung

Neural Information Processing SystemsFeb-13-2026, 07:22:20 GMT

Neural Information Processing Systems http://nips.cc/

agent, proceedings, tfw-ucrl2, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Add feedback

Learning Task Specifications from Demonstrations

Marcell Vazquez-Chanlatte, Susmit Jha, Ashish Tiwari, Mark K. Ho, Sanjit Seshia

Neural Information Processing SystemsFeb-13-2026, 07:02:15 GMT

Neural Information Processing Systems http://nips.cc/

concept class, demonstration, specification, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Add feedback

Transfer of Deep Reactive Policies for MDP Planning

Aniket (Nick) Bajpai, Sankalp Garg, None

Neural Information Processing SystemsFeb-13-2026, 07:00:17 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Text-AwareDiffusionforPolicyLearning

Neural Information Processing SystemsFeb-13-2026, 06:42:00 GMT

Training an agent to achieve particular goals or perform desired behaviors is often accomplished through reinforcement learning, especially in the absence of expert demonstrations. However, supporting novel goals or behaviors through reinforcement learning requires the ad-hoc design of appropriate reward functions, which quickly becomes intractable. Toaddress thischallenge, wepropose Text-AwareDiffusion forPolicyLearning (TADPoLe), which uses apretrained, frozen text-conditioned diffusion model to compute dense zero-shot reward signals for text-aligned policy learning.

large language model, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Technology: