AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

9fc664916bce863561527f06a96f5ff3-Paper.pdf

Neural Information Processing Systems, Aug-22-2025, 00:48:55 GMT

machine learning, natural language, reinforcement learning, (16 more...)

Country:

North America > United States > Illinois (0.04)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.46)

Industry:

Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
(2 more...)

35fdecdf8861bc15110d48fbec3193cf-Paper-Conference.pdf

Neural Information Processing Systems, Aug-22-2025, 00:48:30 GMT

data mining, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States > Michigan (0.14)
Europe > France (0.14)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.51)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science > Data Mining > Big Data (0.66)

7ce5da35e01cfa8d303c2dc71e61a470-Paper-Conference.pdf

Neural Information Processing Systems, Aug-22-2025, 00:47:34 GMT

demonstration, machine learning, reinforcement learning, (14 more...)

Country: North America > United States (1.00)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

Reinforcement Learning with Feedback Graphs

Neural Information Processing Systems, Aug-22-2025, 00:47:20 GMT

We study RL in the tabular MDP setting where the agent receives additional observations per step in the form of transition samples.

algorithm, feedback graph, state-action pair, (11 more...)

Country:

North America > Canada > British Columbia > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)

99bf3d153d4bf67d640051a1af322505-Paper.pdf

Neural Information Processing Systems, Aug-22-2025, 00:47:12 GMT

machine learning, natural language, reinforcement learning, (13 more...)

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(11 more...)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

c22abfa379f38b5b0411bc11fa9bf92f-Paper.pdf

Neural Information Processing Systems, Aug-22-2025, 00:46:43 GMT

algorithm, experience replay, sgd, (14 more...)

Country:

Asia > India > Karnataka > Bengaluru (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

Weighted model estimation for offline model-based reinforcement learning

Neural Information Processing Systems, Aug-22-2025, 00:43:57 GMT

This paper discusses model estimation in offline model-based reinforcement learning (MBRL), which is important for subsequent policy improvement using an estimated model.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Supplements of "Non-crossing quantile regression in deep reinforcement learning"

Neural Information Processing Systems, Aug-22-2025, 00:42:54 GMT

We first introduce the following lemma, which is used to complete the proof of Lemma 1. Lemma. Consider an MDP with countable state and action spaces. Therefore, the inequality (4) holds, which completes the proof. Now we give the proof of Lemma 1. Lemma 1. The proof is similar to that of Proposition 2 of [1]. We assume that instantaneous rewards given a state-action pair are deterministic; the general case is a straightforward generalization using the standard probability argument.

lemma 1, quantile, state-action pair, (12 more...)

Country:

North America > Canada (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Non-crossing quantile regression for deep reinforcement learning

Neural Information Processing Systems, Aug-22-2025, 00:42:47 GMT

Distributional reinforcement learning (DRL) estimates the distribution over future returns instead of the mean to more efficiently capture the intrinsic uncertainty of MDPs.

algorithm, quantile regression, reinforcement, (13 more...)

Country: