AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
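
The trial-and-error idea in the quotation can be illustrated with a minimal sketch. The example below is not taken from Sutton and Barto or from any entry listed on this page: it assumes a simple multi-armed bandit with an epsilon-greedy learner, and the function name `run_bandit`, the reward means, and the noise level are all hypothetical choices made for illustration.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a bandit (illustrative assumption, not a cited method).

    The learner is never told which action is best; it only observes
    the numerical reward of the action it tried.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each action's reward
    counts = [0] * n_arms       # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_arms)
        else:
            # Exploit: take the action with the highest current estimate.
            action = max(range(n_arms), key=lambda a: estimates[a])

        # Noisy reward signal (Gaussian noise with std 0.1 is an assumption).
        reward = rng.gauss(true_means[action], 0.1)

        # Incremental sample-average update of the estimate.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.1, 0.5, 0.9])
    preferred = max(range(len(estimates)), key=lambda a: estimates[a])
    print("estimated values:", [round(v, 2) for v in estimates])
    print("pulls per action:", counts)
    print("preferred action:", preferred)
```

Under these assumptions the learner settles on the action with the highest mean reward purely from the rewards it observes, which is the sense of "discover which actions yield the most reward by trying them."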

Learning Dynamic Belief Graphs to Generalize on Text-Based Games

Neural Information Processing Systems, Feb-7-2026, 18:14:20 GMT

GATA is trained using a combination of reinforcement and self-supervised learning. Our work demonstrates that the learned graph-based representations help agents converge to better policies than their text-only counterparts and facilitate effective generalization across game configurations.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Federated Ensemble-Directed Offline Reinforcement Learning

Neural Information Processing Systems, Feb-7-2026, 18:05:03 GMT

We consider the problem of federated offline reinforcement learning (RL), a scenario under which distributed learning agents must collaboratively learn a high-quality control policy only using small pre-collected datasets generated according to different unknown behavior policies. Naïvely combining a standard offline RL approach with a standard federated learning approach to solve this problem can lead to poorly performing policies. In response, we develop the Federated Ensemble-Directed Offline Reinforcement Learning Algorithm (FEDORA), which distills the collective wisdom of the clients using an ensemble learning approach. We develop the FEDORA codebase to utilize distributed compute resources on a federated learning platform. We show that FEDORA significantly outperforms other approaches, including offline RL over the combined data pool, in various complex continuous control environments and real-world datasets.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.04)

Industry: Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

1cd73be1e256a7405516501e94e892ac-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 17:46:25 GMT

arxiv preprint arxiv, exploitability, oracle, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Orange County > Irvine (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

1cbcaa5abbb6b70f378a3a03d0c26386-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 17:44:41 GMT

cil task, learning, policy function, (17 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Saarland (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)
Asia > Singapore (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

1a0755b249b772ed5529796b0a7cc9bd-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 17:44:10 GMT

dataset, learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

1bf2efbbe0c49b9f567c2e40f645279a-Supplemental.pdf

Neural Information Processing Systems, Feb-7-2026, 17:25:43 GMT

inequality, log null 6, probability, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Workflow (0.67)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.92)
Information Technology > Data Science > Data Mining (0.67)

1bf2efbbe0c49b9f567c2e40f645279a-Paper.pdf

Neural Information Processing Systems, Feb-7-2026, 17:25:39 GMT

algorithm, learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Model-based Policy Optimization with Unsupervised Model Adaptation

Neural Information Processing Systems, Feb-7-2026, 17:16:04 GMT

In recent years, model-free reinforcement learning (MFRL) has achieved tremendous success on a wide range of simulated domains, e.g. ...

artificial intelligence, machine learning, reinforcement learning, (11 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)

18ddfb199d71a8a24f83abc1ced077b7-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 17:04:37 GMT

agent, learning, representation, (13 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > Canada > British Columbia > Vancouver (0.04)
Europe > Austria (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Minimax Value Interval for Off-Policy Evaluation and Policy Optimization

Neural Information Processing Systems, Feb-7-2026, 17:04:19 GMT

Function Approximation. Throughout the paper, we assume access to two function classes Q ⊂ (S × A → R) and W ⊂ (S × A → R). To develop intuition, they are supposed to model Q^π and w_{π/µ}, respectively, though most of our main results are stated without assuming any kind of realizability.

lbw, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)