AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
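
The trial-and-error idea in the quotation can be illustrated with a small sketch. It assumes a three-armed bandit with Gaussian rewards and an epsilon-greedy learner; the function name `run_epsilon_greedy`, the reward means, and the parameter values are hypothetical choices made for illustration and do not come from Sutton and Barto's text. The learner is never told which action is best and only improves its estimates by trying actions and observing rewards.

```python
import random

def run_epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Learn which action pays best purely from observed rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each action was tried
    estimates = [0.0] * n_arms   # running mean reward per action
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick the action with the highest estimated reward.
            action = max(range(n_arms), key=lambda a: estimates[a])
        # The environment returns a noisy reward (hypothetical Gaussian model).
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental update of the sample mean for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward

estimates, counts, total = run_epsilon_greedy([0.2, 0.5, 1.0])
best = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated best action:", best)
print("times each action was tried:", counts)
```

With enough steps the estimates concentrate around the true means, so most pulls end up on the highest-paying action while a small fraction remain exploratory.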

Convergent Policy Optimization for Safe Reinforcement Learning

Ming Yu, Zhuoran Yang, Mladen Kolar, Zhaoran Wang

Neural Information Processing Systems, Feb-14-2026, 13:53:57 GMT

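Given $\theta$, $J^*(\theta)$ and $D^*(\theta)$ are the sample [...] (i.e., a trajectory). Note that $J^*(\theta)$ and $D^*(\theta)$ are random [...]; $J(\theta)$ and $D(\theta)$ are used to denote [...]. Clearly, $J(\theta) = \mathbb{E}[J^*(\theta)]$ and $D(\theta) = \mathbb{E}[D^*(\theta)]$.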
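
The relation in the snippet above, where J and D are expectations of per-trajectory sample values, can be sketched with a Monte Carlo estimate. Everything in the example is an assumption made for illustration: the toy policy with a single parameter `theta`, the reward and cost values, the discount factor, and the helper names `sample_trajectory` and `estimate`. It is not the authors' environment or algorithm.

```python
import random

HORIZON = 20
GAMMA = 0.9

def sample_trajectory(theta, rng):
    """Return one sample (objective, cost) pair from a toy stochastic policy."""
    j_sample, d_sample = 0.0, 0.0
    for t in range(HORIZON):
        # Hypothetical policy: take the "risky" action with probability theta.
        risky = rng.random() < theta
        reward = 1.0 if risky else 0.5
        cost = 1.0 if risky else 0.0
        j_sample += (GAMMA ** t) * reward
        d_sample += (GAMMA ** t) * cost
    return j_sample, d_sample

def estimate(theta, n=2000, seed=0):
    """Average n sample trajectories to approximate the expectations J and D."""
    rng = random.Random(seed)
    samples = [sample_trajectory(theta, rng) for _ in range(n)]
    j_hat = sum(j for j, _ in samples) / n
    d_hat = sum(d for _, d in samples) / n
    return j_hat, d_hat

theta = 0.3
j_hat, d_hat = estimate(theta)

# Closed-form expectations for this toy model, for comparison.
geom = sum(GAMMA ** t for t in range(HORIZON))
exact_j = geom * (theta * 1.0 + (1 - theta) * 0.5)
exact_d = geom * theta
print("J estimate vs exact:", j_hat, exact_j)
print("D estimate vs exact:", d_hat, exact_d)
```

Each call to `sample_trajectory` is random because the policy is stochastic, while the averages approach the fixed expectations as the number of trajectories grows.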

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Pennsylvania (0.04)
(5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)

Unsupervised Learning of Object Keypoints for Perception and Control

Tejas D. Kulkarni, Ankush Gupta, Catalin Ionescu, Sebastian Borgeaud, Malcolm Reynolds, Andrew Zisserman, Volodymyr Mnih

Neural Information Processing Systems, Feb-14-2026, 13:29:38 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry:

Leisure & Entertainment > Games (0.71)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

5c186016d0844767209dc36e9e61441b-Paper-Conference.pdf

Neural Information Processing Systems, Feb-14-2026, 12:41:35 GMT

DeMa's focus on sequences diminishes approximately exponentially.

machine learning, natural language, reinforcement learning, (15 more...)

Country:

Asia > China > Zhejiang Province (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
(2 more...)

Doubly Mild Generalization for Offline Reinforcement Learning

Yixiu Mao, Qi Wang, Yun Qu

Neural Information Processing Systems, Feb-14-2026, 12:29:01 GMT

Offline Reinforcement Learning (RL) suffers from extrapolation error and value overestimation. From a generalization perspective, this issue can be attributed to the over-generalization of value functions or policies towards out-of-distribution (OOD) actions.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

Asia > China (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

5b76d77e7095c6480ed827b85f0c2878-Paper-Conference.pdf

Neural Information Processing Systems, Feb-14-2026, 10:31:35 GMT

experiment, machine learning, reinforcement learning, (15 more...)

Country: North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education (0.94)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Doubly-Robust Lasso Bandit

Gi-Soo Kim, Myunghee Cho Paik

Neural Information Processing Systems, Feb-14-2026, 10:07:11 GMT

While the reward compensation mechanism is unknown, the learner can adapt his (her) decision to the past reward feedback so as to maximize the sum of rewards.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Unsupervised Curricula for Visual Meta-Reinforcement Learning

Allan Jabri, Kyle Hsu, Abhishek Gupta, Ben Eysenbach, Sergey Levine, Chelsea Finn

Neural Information Processing Systems, Feb-14-2026, 10:05:30 GMT

Neural Information Processing Systems http://nips.cc/

exploration, international conference, task distribution, (12 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report (0.46)
Instructional Material > Course Syllabus & Notes (0.40)

Industry: Education (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

Neural Information Processing Systems, Feb-14-2026, 09:21:31 GMT

In fact, the interaction of these two aspects is complicated by each agent's own safety constraint requiring information from all other agents.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:

Research Report > New Finding (0.67)
Overview (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities

Neural Information Processing Systems, Feb-14-2026, 09:21:27 GMT

In fact, the interaction of these two aspects is complicated by each agent's own safety constraint requiring information from all other agents.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Country: