AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Forethought_and_Hindsight_in_Credit_Assignment__Camera_Ready_ (3).pdf

Neural Information Processing Systems, Oct-2-2025, 06:35:31 GMT

Credit assignment, i.e., determining how to correctly associate delayed rewards with states or state-action pairs, is a crucial problem for reinforcement learning (RL) agents (Sutton and Barto, 2018).
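To make the idea of associating a delayed reward with earlier states concrete, here is a minimal sketch of the simplest baseline, the discounted return. It is not the method of the listed paper; the function name `discounted_returns`, the toy episode, and the discount `gamma=0.5` are assumptions chosen purely for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Sweep backwards so each step inherits the discounted future reward.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical episode: the only reward arrives at the final step.
episode_rewards = [0.0, 0.0, 0.0, 1.0]
credit = discounted_returns(episode_rewards, gamma=0.5)
print(credit)  # [0.125, 0.25, 0.5, 1.0]
```

Each earlier step receives a share of the final reward that shrinks geometrically with its distance from that reward. This baseline assigns credit by temporal proximity alone, which is the limitation that more refined credit-assignment methods aim to address.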

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

17b3c7061788dbe82de5abe9f6fe22b3-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 06:26:15 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: North America > United States > California (0.28)

Genre: Research Report (0.69)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Hindsight Credit Assignment

Neural Information Processing Systems, Oct-2-2025, 06:18:01 GMT

We consider the problem of efficient credit assignment in reinforcement learning.

credit assignment, machine learning, reinforcement learning, (16 more...)

Country: North America (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

172ef5a94b4dd0aa120c6878fc29f70c-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 06:01:52 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country: North America (0.28)

Industry:

Leisure & Entertainment > Games (0.93)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

172ef5a94b4dd0aa120c6878fc29f70c-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-2-2025, 06:01:40 GMT

We thank all reviewers for their valuable feedback. We believe our results make a significant contribution to the field of theoretical reinforcement learning. Therefore, analyzing a variant of Nash Q-learning may be of independent interest. Since an NE always exists, a CCE always exists, i.e., the set of linear constraints is always feasible. The "hat" version is the actual certified policy (which can be executed as in Algorithms 2 and 4).

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Inverse Reinforcement Learning with Locally Consistent Reward Functions

Quoc Phong Nguyen, Bryan Kian Hsiang Low, Patrick Jaillet

Neural Information Processing Systems, Oct-2-2025, 05:41:53 GMT

Existing inverse reinforcement learning (IRL) algorithms have assumed each expert's demonstrated trajectory to be produced by only a single reward function. This paper presents a novel generalization of the IRL problem that allows each trajectory to be generated by multiple locally consistent reward functions, hence catering to more realistic and complex experts' behaviors. Solving our generalized IRL problem thus involves not only learning these reward functions but also the stochastic transitions between them at any state (including unvisited states). By representing our IRL problem with a probabilistic graphical model, an expectation-maximization (EM) algorithm can be devised to iteratively learn the different reward functions and the stochastic transitions between them in order to jointly improve the likelihood of the expert's demonstrated trajectories. As a result, the most likely partition of a trajectory into segments that are generated from different locally consistent reward functions selected by EM can be derived. Empirical evaluation on synthetic and real-world datasets shows that our IRL algorithm outperforms the state-of-the-art EM clustering with maximum likelihood IRL, which is, interestingly, a reduced variant of our approach.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

Asia > Singapore (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

168efc366c449fab9c2843e9b54e2a18-Paper.pdf

Neural Information Processing Systems, Oct-2-2025, 05:32:06 GMT

exponential decay property, machine learning, reinforcement learning, (19 more...)

Country: North America > United States (0.28)

Genre:

Research Report (0.46)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Communications > Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

RUDDER: Return Decomposition for Delayed Rewards

Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter

Neural Information Processing Systems, Oct-2-2025, 05:30:57 GMT

We propose RUDDER, a novel reinforcement learning approach for delayed rewards in finite Markov decision processes (MDPs).

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States (0.46)
Europe (0.46)

Industry:

Education (0.46)
Leisure & Entertainment > Games (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Learning Affordance Landscapes for Interaction Exploration in 3D Environments

Neural Information Processing Systems, Oct-2-2025, 05:10:36 GMT

Embodied agents operating in human spaces must be able to master how their environment works: what objects can the agent use, and how can it use them? We introduce a reinforcement learning approach for exploration for interaction, whereby an embodied agent autonomously discovers the affordance landscape of a new unmapped 3D environment (such as an unfamiliar kitchen).

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Industry: Leisure & Entertainment > Games (0.46)

Technology: