AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

31f119089f702e48ecfd138c1bc82c4a-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 20:17:18 GMT

large language model, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Add feedback

b5488aeff42889188d03c9895255cecc-Paper.pdf

Neural Information Processing SystemsFeb-10-2026, 20:13:26 GMT

iteration, learner, teaching, (15 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Instructional Material (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Mismatched NoMore: JointModel-PolicyOptimizationforModel-BasedRL

Neural Information Processing SystemsFeb-10-2026, 20:11:23 GMT

A version ofthis bound becomes tight under certain assumptions.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Add feedback

9316769afaaeeaad42a9e3633b14e801-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 20:00:19 GMT

iso-dream, prediction, world model, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
(2 more...)

Add feedback

e1696007be4eefb81b1a1d39ce48681b-Paper.pdf

Neural Information Processing SystemsFeb-10-2026, 19:41:58 GMT

In this work, we identify anovel set of conditions that ensure convergence with probability 1 ofQ-learning with linear function approximation, by proposing a twotime-scalevariationthereof.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Add feedback

31888563b194f9bb33ce1aebc7e1551c-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 19:29:28 GMT

machine learning, natural language, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Hong Kong (0.04)
South America > Brazil > São Paulo (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Industry:

Information Technology (0.92)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

Add feedback

ABi-LevelFrameworkforLearningtoSolve CombinatorialOptimizationonGraphs

Neural Information Processing SystemsFeb-10-2026, 19:21:02 GMT

However, achieving such an assumption is non-trivial, leading to the following two aspects of challenges. On the one hand, it is challenging to design a model with enough capacity with limited computational resources, andexisting models areusually tailored forspecific problems which require heavytrailand-error [25,57,59].

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > Italy (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)

Add feedback

b2eeb7362ef83deff5c7813a67e14f0a-Supplemental.pdf

Neural Information Processing SystemsFeb-10-2026, 19:20:26 GMT

algorithm, sample complexity, theorem 2, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Finite-SampleAnalysisofOff-PolicyTD-Learningvia GeneralizedBellmanOperators

Neural Information Processing SystemsFeb-10-2026, 19:20:21 GMT

Itisknown that policyevaluation has the interpretation of solving ageneralized Bellman equation. Inthispaper,wederivefinite-sample bounds foranygeneral off-policy TD-like stochastic approximation algorithm that solves for the fixedpoint of this generalized Bellman operator.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)

Add feedback