AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

9c7900fac04a701cbed83256b76dbaa3-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 23:36:52 GMT

machine learning, reinforcement learning, trajectory, (18 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

A State Representation for Diminishing Rewards

Neural Information Processing Systems | Oct-10-2025, 23:29:32 GMT

In such situations, the successor representation (SR) is a popular framework which supports rapid policy evaluation by decoupling a policy's

data mining, machine learning, reinforcement learning, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Portugal > Porto > Porto (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science > Data Mining (0.92)

83dc5747870ea454cab25e30bef4eb8a-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 23:28:42 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
South America > Brazil (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Large Language Models Can Implement Policy Iteration

Ethan Brooks, Logan Walls, Richard L. Lewis

Neural Information Processing Systems | Oct-10-2025, 23:19:27 GMT

Gradient techniques are inherently slow, sacrificing the "few-shot" quality

large language model, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > Michigan (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (0.70)

Industry: Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

57587d8d6a7ede0e5302fc22d0878c53-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 23:18:43 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)

Genre:

Research Report > New Finding (0.92)
Workflow (0.67)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.45)

Overleaf Example

Neural Information Processing Systems | Oct-10-2025, 22:50:01 GMT

Second, we enforce the posterior update model to learn the dynamics of the latent variable.

dlcmdp, dynamite-rl, latent context, (15 more...)

Country:

North America > United States > California (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment (0.67)
Media (0.67)
Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

fe7f375ef01e43f17d2c32b28a176577-Paper-Conference.pdf

Neural Information Processing Systems | Oct-10-2025, 22:37:49 GMT

experiment, optimality, probability, (16 more...)

Country:

North America > United States > Washington > King County > Seattle (0.28)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

An Offline Adaptation Framework for Constrained Multi-Objective Reinforcement Learning

Neural Information Processing Systems | Oct-10-2025, 22:29:30 GMT

In the standard reinforcement learning (RL) setting, the primary goal is to obtain a policy that maximizes a cumulative scalar reward [Sutton and Barto, 2018].

dataset, demonstration, target preference, (15 more...)

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Safety through feedback in Constrained RL

Neural Information Processing Systems | Oct-10-2025, 22:22:03 GMT

This feedback can be system-generated or elicited from a human observing the training process. Previous approaches have not been able to scale to complex environments and are constrained to receiving feedback at the state level, which can be expensive to collect. To this end, we introduce an approach that scales to more complex domains and extends beyond state-level feedback, thus reducing the burden on the evaluator.

agent, cost function, trajectory, (13 more...)

Country: