AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Context-dependent upper-confidence bounds for directed exploration

Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White

Neural Information Processing Systems, Feb-15-2026, 06:43:55 GMT

Second, we [...] $\delta_t = r_{t+1} + \gamma_{t+1} x_{t+1}^\top w - x_t^\top w$, the TD-error for $w$ (see (2)). This [...] is to be larger [...].
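The snippet above refers to a TD-error for a weight vector w. As a minimal sketch, assuming the standard linear form (reward, plus the discounted value estimate of the next state, minus the value estimate of the current state), it can be computed as follows. The function name `td_error`, its parameter names, and the toy numbers are illustrative assumptions, not taken from the paper.

```python
def td_error(reward, gamma_next, x_next, x, w):
    """TD-error for a linear value estimate v(s) = x(s)^T w.

    Assumed form: delta = r_{t+1} + gamma_{t+1} * x_{t+1}^T w - x_t^T w.
    All names here are hypothetical and chosen for illustration only.
    """
    v_next = sum(xi * wi for xi, wi in zip(x_next, w))  # x_{t+1}^T w
    v_curr = sum(xi * wi for xi, wi in zip(x, w))       # x_t^T w
    return reward + gamma_next * v_next - v_curr


# Toy example with two features (made-up values).
w = [0.5, -1.0]
x_t = [1.0, 0.0]    # features of the current state
x_tp1 = [0.0, 1.0]  # features of the next state
delta = td_error(reward=1.0, gamma_next=0.9, x_next=x_tp1, x=x_t, w=w)
print(delta)  # approximately -0.4
```

A negative value here means the observed reward plus the discounted next-state estimate fell short of the current-state estimate.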

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Alberta (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Multi-Agent Common Knowledge Reinforcement Learning

Christian Schroeder de Witt, Jakob Foerster, Gregory Farquhar, Philip Torr, Wendelin Boehmer, Shimon Whiteson

Neural Information Processing Systems, Feb-15-2026, 05:56:21 GMT

Figure 3: Game matrices A (top) and B (bottom) [left]. All experiments use SMAC settings for comparability (see Samvelyan et al. (2019) and Appendix B for details).

machine learning, mackrl, reinforcement learning, (15 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Africa > Sudan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Recovering from Out-of-sample States via Inverse Dynamics in Offline Reinforcement Learning

Neural Information Processing Systems, Feb-15-2026, 05:31:14 GMT

However, such pessimism for out-of-sample data could be too restrictive and sample-inefficient, as not all out-of-sample (unseen) states are ungeneralizable [20].

inverse dynamic model, machine learning, reinforcement learning, (16 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

7a0f7e9d9b42b26e5bfc9ba4c6e5287c-Paper-Conference.pdf

Neural Information Processing Systems, Feb-15-2026, 05:31:10 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(7 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.52)

Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing

Zehong Hu, Yitao Liang, Jie Zhang, Zhao Li, Yang Liu

Neural Information Processing Systems, Feb-15-2026, 05:30:05 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, bayesian inference algorithm, inference algorithm, (13 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.51)

Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric

Neural Information Processing Systems, Feb-15-2026, 05:07:01 GMT

A number of RL methods have been proposed to boost exploration by designing an intrinsic reward signal as an exploration bonus.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Asia > Macao (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Hong Kong (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)

Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization

Neural Information Processing Systems, Feb-15-2026, 04:47:43 GMT

Unlike Gaussian policies, the log-likelihood in diffusion policies is inaccessible; thus this entropy term is nontrivial. Moreover, to reduce the large variance of diffusion policies, we also develop an efficient behavior policy through action selection. This can further improve its sample efficiency during online interaction.

justification, machine learning, reinforcement learning, (17 more...)

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
