AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Playing hard exploration games by watching YouTube

Yusuf Aytar, Tobias Pfaff, David Budden, Thomas Paine, Ziyu Wang, Nando de Freitas

Neural Information Processing SystemsFeb-12-2026, 14:51:18 GMT

Deep reinforcement learning methods traditionally struggle withtaskswhere environment rewards are particularly sparse.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Industry: Leisure & Entertainment (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Large Language Models Can Implement Policy Iteration Ethan Brooks 1, Logan Walls 2, Richard L. Lewis

Neural Information Processing SystemsFeb-12-2026, 14:42:57 GMT

Gradient techniques are inherently slow, sacrificing the "few-shot" quality

large language model, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Michigan (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (0.70)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

Add feedback

Constrained Cross-Entropy Method for Safe Reinforcement Learning

Min Wen, Ufuk Topcu

Neural Information Processing SystemsFeb-12-2026, 14:42:21 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, constraint, international conference, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Pennsylvania (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

e95475f5fb8edb9075bf9e25670d4013-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 14:42:06 GMT

learning, noise level, policy gradient, (12 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)

Add feedback

e92381dba235a8309f08ce46376189a9-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 14:33:22 GMT

The transition dynamics simply mixes an action and a random sampled latent. It then applies an exponential moving average for temporal persistency, the resulting latent is decoded to image using pretrained generator.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: