AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Adaptable Agent Populations via a Generative Model of Policies

Neural Information Processing Systems, Feb-7-2026, 19:14:02 GMT

In reinforcement learning (RL), it is common to learn a single policy that fits an environment.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

0c946accd3ccc88c09dfae7e1cd40ffe-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems, Feb-7-2026, 19:13:45 GMT

agent, dataset, representation, (14 more...)

Country: Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

0cfc9404f89400c5ed897035e0d3748c-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 19:05:34 GMT

algorithm, arxiv preprint arxiv, probability, (14 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)

Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability

Neural Information Processing Systems, Feb-7-2026, 19:05:30 GMT

Goal-conditioned reinforcement learning (GCRL) refers to learning general-purpose skills that aim to reach diverse goals.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)

216f44e2d28d4e175a194492bde9148f-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 19:03:33 GMT

We assume the environment is modeled as a discrete-time factored-action MDP (FA-MDP) $M = \langle S, A, P, R, \gamma \rangle$, where $S$ is the set of states $s$, $A$ is the set of vector-represented actions $a = (a_1, \ldots, a_m)$, $P(s' \mid s, a) = \Pr(s_{t+1} = s' \mid s_t = s, a_t = a)$ is the transition probability, $R(s, a) \in \mathbb{R}$ is the immediate reward for taking action $a$ in state $s$, and $\gamma \in [0, 1)$ is the discount factor.
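
As a rough illustration of the FA-MDP tuple described above, the following minimal Python sketch stores its five components and enumerates the vector-represented actions. It is not taken from the paper: the class name `FactoredActionMDP`, the field `action_dims`, the helper `discounted_return`, and the two-state toy environment are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

State = int
Action = Tuple[int, ...]  # vector-represented action a = (a_1, ..., a_m)


@dataclass
class FactoredActionMDP:
    """Hypothetical container for the tuple (S, A, P, R, gamma)."""
    states: List[State]
    action_dims: List[int]  # assumed: number of choices per action component
    transition: Callable[[State, Action], Dict[State, float]]  # P(. | s, a)
    reward: Callable[[State, Action], float]  # R(s, a)
    gamma: float  # discount factor

    def __post_init__(self) -> None:
        # The definition requires gamma to lie in [0, 1).
        if not 0.0 <= self.gamma < 1.0:
            raise ValueError("gamma must lie in [0, 1)")

    def actions(self) -> List[Action]:
        # A is the Cartesian product of the per-component choices.
        return list(product(*(range(n) for n in self.action_dims)))


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t, accumulated backwards over a finite trajectory."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Toy environment, invented purely for illustration.
def toy_transition(s: State, a: Action) -> Dict[State, float]:
    # Deterministic: the next state is the parity of the action components.
    return {sum(a) % 2: 1.0}


def toy_reward(s: State, a: Action) -> float:
    return float(sum(a))


mdp = FactoredActionMDP(
    states=[0, 1],
    action_dims=[2, 2],  # m = 2 binary action components
    transition=toy_transition,
    reward=toy_reward,
    gamma=0.5,
)
```

With `m = 2` binary components, `mdp.actions()` yields four joint actions, which shows how the joint action space grows multiplicatively with the number of components.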

action persistence, machine learning, reinforcement learning, (16 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Information Directed Reward Learning for Reinforcement Learning

Neural Information Processing Systems, Feb-7-2026, 19:03:16 GMT

From such expensive feedback, we aim to learn a model of the reward that allows standard RL algorithms to achieve high expected returns with as few expert queries as possible.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Country: Europe > Austria > Vienna (0.04)

Technology: