AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

Carlos Riquelme, Hugo Penedones, Damien Vincent, Hartmut Maennel, Sylvain Gelly, Timothy A. Mann, Andre Barreto, Gergely Neu

Neural Information Processing Systems, Oct-3-2025, 02:46:15 GMT

In reinforcement learning (RL), an agent must learn how to behave while interacting with an environment. This challenging problem is usually formalized as the search for a decision policy, i.e., a

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

65ae450c5536606c266f49f1c08321f2-Paper.pdf

Neural Information Processing Systems, Oct-3-2025, 02:43:27 GMT

machine learning, reinforcement learning, trajectory, (17 more...)

Country: North America > United States (0.68)

Genre: Research Report (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Government > Military (0.69)
Government > Regional Government (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.46)

27b51baca8377a0cf109f6ecc15a0f70-Paper-Conference.pdf

Neural Information Processing Systems, Oct-3-2025, 02:23:47 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: North America > Canada > Quebec > Montreal (0.29)

Genre: Research Report (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

SHAQ: Incorporating Shapley Value Theory into Multi-Agent Q-Learning

Neural Information Processing Systems, Oct-3-2025, 02:21:21 GMT

Value factorisation is a useful technique for multi-agent reinforcement learning (MARL) in global reward games; however, its underlying mechanism is not yet fully understood. This paper studies a theoretical framework for value factorisation with interpretability via Shapley value theory.
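
For readers unfamiliar with the Shapley value itself, the following is a minimal sketch of the classical definition from cooperative game theory: each player's value is its marginal contribution averaged over all orderings of the players. It is not the SHAQ algorithm or anything taken from the paper; the function names and the three-agent toy game are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all player orderings.

    `value` maps a frozenset of players (a coalition) to a number.
    The cost grows factorially, so this is only practical for a handful of players.
    """
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical global reward: 1 only if agents "a" and "b" both take part."""
    return 1.0 if {"a", "b"} <= coalition else 0.0


phi = shapley_values(["a", "b", "c"], toy_value)
```

In this toy game the credit is split evenly between "a" and "b", agent "c" receives nothing, and the values sum to the reward of the full coalition.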

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.27)

Genre: Research Report (1.00)

Industry:

Government (0.67)
Information Technology (0.67)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

27985d21f0b751b933d675930aa25022-Paper-Conference.pdf

Neural Information Processing Systems, Oct-3-2025, 02:21:17 GMT

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.98)
(2 more...)

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems, Oct-3-2025, 02:18:15 GMT

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance. Summary: This paper provides a Bayesian expected regret bound for the Posterior Sampling for Reinforcement Learning (PSRL) algorithm. PSRL was introduced by [Strens2000] and can be seen as an application of Thompson sampling to RL problems: a model is sampled from the posterior distribution over models, the optimal policy for the sampled model is computed, that policy is followed until the end of the horizon, and the distribution over models is updated. PSRL for finite MDPs was analyzed by [OVRR2013]; the main contribution of this paper is to analyze PSRL for MDPs with general state and action spaces. In the analysis, the authors use the concept of eluder dimension introduced by [RVR2013]. Eluder dimension was previously used in the analysis of bandit problems (for both Thompson sampling and Optimism in the Face of Uncertainty (OFU) approaches).
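
The four PSRL steps described in the summary can be sketched for the simplest setting, a small tabular finite-horizon MDP. This is an illustrative sketch under stated assumptions, not the reviewed paper's method, which targets general state and action spaces. The Dirichlet prior over transitions, the Gaussian reward model with unit observation variance, the fixed start state 0, and the names `solve_finite_horizon` and `psrl` are all assumptions made here.

```python
import numpy as np


def solve_finite_horizon(P, R, H):
    """Backward induction. P: (S, A, S) transition probabilities, R: (S, A) mean rewards.

    Returns an (H, S) array giving the greedy action for each step and state.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    policy = np.zeros((H, S), dtype=int)
    for h in reversed(range(H)):
        Q = R + P @ V                  # (S, A): reward plus expected value-to-go
        policy[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy


def psrl(true_P, true_R, H, episodes, seed=0):
    """Tabular PSRL sketch; returns the total reward collected in each episode."""
    rng = np.random.default_rng(seed)
    S, A, _ = true_P.shape
    counts = np.ones((S, A, S))        # assumed Dirichlet(1, ..., 1) prior on transitions
    reward_sum = np.zeros((S, A))
    visits = np.zeros((S, A))
    returns = []
    for _ in range(episodes):
        # 1. Sample a model from the posterior.
        P = np.array([[rng.dirichlet(counts[s, a]) for a in range(A)]
                      for s in range(S)])
        # Assumed N(0, 1) prior on mean rewards with unit-variance observations.
        R = rng.normal(reward_sum / (visits + 1.0), 1.0 / np.sqrt(visits + 1.0))
        # 2. Compute the optimal policy for the sampled model.
        policy = solve_finite_horizon(P, R, H)
        # 3. Follow that policy until the end of the horizon.
        s, total = 0, 0.0
        for h in range(H):
            a = policy[h, s]
            r = true_R[s, a] + rng.normal()
            s_next = rng.choice(S, p=true_P[s, a])
            # 4. Update the posterior statistics with the observed transition.
            counts[s, a, s_next] += 1
            reward_sum[s, a] += r
            visits[s, a] += 1
            total += r
            s = s_next
        returns.append(total)
    return np.array(returns)
```

Calling `psrl(true_P, true_R, H=5, episodes=200)` on a small hand-built MDP returns one noisy episode return per episode. The policy is resampled once per episode rather than at every step, which matches the "followed until the end of the horizon" step in the summary.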

algorithm, dimension, eluder dimension, (11 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology: