AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems, Feb-9-2026, 10:14:58 GMT

Designing efficient reinforcement learning (RL) algorithms for environments with large state and action spaces is one of the main tasks in the RL community.

akih, machine learning, skih, (19 more...)

Country:

North America > United States (0.04)
Europe > United Kingdom > England (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

7695ea769f021803c508817dd374bb27-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 10:14:55 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

62b4fea131cfd5b7504eae356b75bbd8-Paper-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 10:13:22 GMT

algorithm, arxiv preprint arxiv, markov game, (11 more...)

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)

Unpacking Reward Shaping

Neural Information Processing Systems, Feb-9-2026, 09:55:56 GMT

Much of this work is based on upper confidence bound (UCB) principles and prescribes some kind of exploration bonus to prioritize exploration of rarely visited regions.
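A minimal sketch of the exploration-bonus idea mentioned above, assuming a simple count-based bonus of the form scale * sqrt(log(t) / n). The function names, the `scale` parameter, and the exact bonus formula are illustrative assumptions, not the method of the listed paper.

```python
import math


def ucb_bonus(visit_count: int, total_steps: int, scale: float = 1.0) -> float:
    """Hypothetical count-based bonus: larger for rarely visited actions."""
    # Clamp both arguments so the bonus is defined for unvisited actions.
    n = max(visit_count, 1)
    t = max(total_steps, 2)
    return scale * math.sqrt(math.log(t) / n)


def select_action(q_values, counts, total_steps, scale=1.0):
    """Pick the action with the highest value estimate plus bonus."""
    scores = [
        q + ucb_bonus(n, total_steps, scale)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Action 0 looks slightly better but is well explored;
    # action 1 has been tried once, so its bonus dominates.
    print(select_action([0.5, 0.4], [100, 1], total_steps=101))
```

In this sketch the bonus shrinks as an action's visit count grows, so the choice gradually falls back to the plain value estimates.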

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)

6255f22349da5f2126dfc0b007075450-Paper-Conference.pdf

Simon Zhai

Neural Information Processing Systems, Feb-9-2026, 09:55:53 GMT

reinforcement learning, state action pair, ucbvi-shaped, (11 more...)

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)
Information Technology > Data Science (0.93)

Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning

Neural Information Processing Systems, Feb-9-2026, 09:53:29 GMT

In Distributional Reinforcement Learning (D-RL) [Bellemare et al., 2023], an agent aims to estimate [...] [Sutton and Barto, 2018], where the objective is to predict the expected return only. In Section 3, we answer this methodological question, showing that it is possible to reformulate Policy Evaluation in a distributional setting so that its performance index is explicitly intertwined with the representation of the (state or action) spaces.
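As a rough illustration of why predicting the expected return only can hide information that a distributional estimate keeps, the sketch below compares two made-up sets of sampled returns. The sample values and helper names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def expected_return(samples):
    """Classic policy evaluation target: the mean of the sampled returns."""
    return sum(samples) / len(samples)


def empirical_distribution(samples):
    """Distributional target: probability of each observed return value."""
    counts = Counter(samples)
    return {value: count / len(samples) for value, count in counts.items()}


# Two hypothetical policies with the same mean return but different risk.
returns_a = [1.0, 1.0, 1.0, 1.0]
returns_b = [0.0, 0.0, 2.0, 2.0]

if __name__ == "__main__":
    print(expected_return(returns_a), expected_return(returns_b))
    print(empirical_distribution(returns_a))
    print(empirical_distribution(returns_b))
```

Both sample sets have the same mean, so an estimator of the expected return cannot tell them apart, while their empirical distributions differ.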

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

Europe > Italy > Lombardy > Milan (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.42)

Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting

Neural Information Processing Systems, Feb-9-2026, 09:43:58 GMT

In the much more complex football game, URI's policy beat the built-in AIs with a 37% winning rate, while GPT-based agents can only achieve a 6% winning rate.

large language model, machine learning, reinforcement learning, (18 more...)

Country: