AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

Neural Information Processing Systems | Feb-11-2026, 14:45:48 GMT

The vertical separators correspond to loading network weights and replay buffer for fine-tuning while offline pre-training on replay buffer using QDagger (Section 4.1) for reincarnation. Shaded regions show 95% confidence intervals.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Industry:

Education (0.68)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning

Neural Information Processing Systems | Feb-11-2026, 14:36:14 GMT

[Figure residue: "Planned Path", "Graph + SF Update", "4. Use random policy to explore"]

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Puerto Rico > San Juan > San Juan (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Hindsight Credit Assignment

Neural Information Processing Systems | Feb-11-2026, 14:35:58 GMT

A reinforcement learning (RL) agent is tasked with two fundamental, interdependent problems: exploration (how to discover useful data), and credit assignment (how to incorporate it). The simplest way of estimating the value function is by averaging returns (future discounted sums of rewards) starting from taking action a in state x.
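
The return-averaging estimate mentioned in this excerpt can be illustrated with a minimal Python sketch. It is not the paper's method, only plain every-visit Monte Carlo averaging under stated assumptions: episodes are finite lists of (state, action, reward) tuples, and the names `discounted_returns`, `mc_q_estimate`, the toy episodes, and the discount `gamma=0.5` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def mc_q_estimate(episodes, gamma=0.9):
    """Every-visit Monte Carlo estimate of Q(x, a): the average observed return
    following each occurrence of action a in state x.

    episodes: list of trajectories, each a list of (state, action, reward).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for episode in episodes:
        rewards = [r for _, _, r in episode]
        for (x, a, _), g in zip(episode, discounted_returns(rewards, gamma)):
            totals[(x, a)] += g
            counts[(x, a)] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical toy data: two short episodes sharing the same first step.
episodes = [
    [("x0", "left", 0.0), ("x1", "right", 1.0)],
    [("x0", "left", 0.0), ("x1", "left", 0.0)],
]
q = mc_q_estimate(episodes, gamma=0.5)
```

In this toy data the estimate for ("x0", "left") averages over both episodes, although only one of them ends in reward. The first step is given the same weight in each case, which is the credit assignment difficulty the excerpt describes.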

machine learning, reinforcement learning, trajectory, (18 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

e22dd5dabde45eda5a1a67772c8e25dd-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 14:17:18 GMT

complexity, counterexample, learner, (14 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

4d4a3b6a34332d80349137bcc98164a5-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 14:07:31 GMT

arxiv preprint arxiv, machine learning, reinforcement learning, (18 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(3 more...)

Genre:

Workflow (0.67)
Research Report (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

b931c44c35ce09e942edab7003eb3daa-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 14:06:27 GMT

mdp, probability, specification, (16 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Netherlands > Gelderland > Nijmegen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

4cddc8fc57039f8fe44e23aba1e4df40-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 13:56:45 GMT

attack strategy, machine learning, reinforcement learning, (17 more...)

Country: North America > United States > California > Yolo County > Davis (0.14)

Industry:

Information Technology > Security & Privacy (0.52)
Leisure & Entertainment > Games (0.45)
Energy > Energy Storage (0.45)
Government > Military (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

RUDDER: Return Decomposition for Delayed Rewards

Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter

Neural Information Processing Systems | Feb-11-2026, 13:56:30 GMT

Neural Information Processing Systems http://nips.cc/

redistribution, reward redistribution, rudder, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Industry:

Education (0.46)
Leisure & Entertainment > Games (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

b8bf2c0dd0b48511889b7d3b2c5fc8f5-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 13:55:39 GMT

algorithm, optimal threshold policy, threshold policy, (16 more...)

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
Oceania > New Zealand (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (1.00)
Banking & Finance > Economy (0.93)
Transportation > Ground > Road (0.69)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

e140dbab44e01e699491a59c9978b924-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 13:55:28 GMT

Success stories of deep reinforcement learning (RL) from high dimensional inputs such as pixels or large spatial layouts include achieving superhuman performance on Atari games [30, 37, 1], grandmaster level in Starcraft II [50] and grasping a diverse set of objects with impressive success rates and generalization with robots in the real world [21].

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
