AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
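The idea in the quotation, discovering which actions pay off by trying them, can be illustrated with a small multi-armed bandit. The sketch below uses epsilon-greedy action selection, a standard textbook technique that this page itself does not specify; the function name `run_bandit`, the reward probabilities, and the step count are all assumptions chosen for illustration.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a Bernoulli bandit (illustrative sketch).

    true_means: hypothetical per-arm probabilities of receiving reward 1.
    The learner never sees these values; it only observes sampled rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was tried
    estimates = [0.0] * n_arms   # running average reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a uniformly random action.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: take the action that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment returns a numerical reward signal.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_bandit([0.1, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated values:", [round(v, 2) for v in estimates])
print("pulls per arm:", counts, "-> preferred arm:", best_arm)
```

The learner is never told that the last arm is best; it infers this from rewards alone, and the small exploration rate keeps it sampling the other arms occasionally.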

4fe1859112230a032c7143a9adc3be78-Supplemental-Conference.pdf

Neural Information Processing Systems Feb-8-2026, 22:14:11 GMT

crossover, molecule, reaction, (15 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Materials (0.92)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

4fe1859112230a032c7143a9adc3be78-Paper-Conference.pdf

Neural Information Processing Systems Feb-8-2026, 22:14:07 GMT

algorithm, crossover, molecule, (16 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Appendix: Continuous Doubly Constrained Batch Reinforcement Learning

Neural Information Processing Systems Feb-8-2026, 21:59:52 GMT

However, numbers for BCQ and SAC are from our runs for all tasks. These plots show that, in the vast majority of environments, CDC exhibits consistently better performance across different seeds/iterations.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Continuous Doubly Constrained Batch Reinforcement Learning

Neural Information Processing Systems Feb-8-2026, 21:59:48 GMT

The limited data in batch RL produces inherent uncertainty in value estimates of states/actions that were insufficiently represented in the training data.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

multi

Neural Information Processing Systems Feb-8-2026, 21:43:39 GMT

Multi-agent reinforcement learning has recently shown great promise as an approach to networked system control. Arguably, one of the most difficult and important tasks to which large-scale networked system control is applicable is common-pool resource management.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

Europe > United Kingdom > England > Greater London > London (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Africa > South Africa > Gauteng > Johannesburg (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn

Neural Information Processing Systems Feb-8-2026, 21:31:16 GMT

After any random batch update, network outputs can change indirectly to unexpected values for input data not included in the batch; this paper calls the effect churn.

machine learning, reinforcement learning, value and policy, (16 more...)

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Portugal > Braga > Braga (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
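
The churn effect described in this entry can be shown with a toy model. The sketch below is not the paper's method or experimental setup: it swaps the neural network for a two-weight linear predictor, and the helper names `predict` and `sgd_step`, the feature vectors, and the learning rate are assumptions made for illustration. It shows only that one gradient step on a batch also moves the output for an input outside that batch, because the parameters are shared.

```python
def predict(weights, features):
    """Linear prediction: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def sgd_step(weights, batch, lr=0.1):
    """One gradient step on mean squared error over the batch."""
    grad = [0.0] * len(weights)
    for features, target in batch:
        error = predict(weights, features) - target
        for i, f in enumerate(features):
            grad[i] += 2.0 * error * f / len(batch)
    return [w - lr * g for w, g in zip(weights, grad)]


weights = [0.0, 0.0]
batch = [([1.0, 0.5], 1.0)]   # the only (input, target) pair trained on
held_out = [0.5, 1.0]         # input that is NOT in the batch

before = predict(weights, held_out)
new_weights = sgd_step(weights, batch)
after = predict(new_weights, held_out)

# Churn here: change in the output for the held-out input caused by
# an update that never looked at that input.
churn = abs(after - before)
print("held-out output before:", before, "after:", after, "churn:", churn)
```

In this toy case a held-out input orthogonal to the batch input, such as `[0.5, -1.0]`, would be left unchanged by the step; overlapping features are what let the update leak.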

Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer

Yining Ma, Jingwen Li, Zhiguang Cao, Wen Song, Le Zhang, Zhenghua Chen, Jing Tang

Neural Information Processing Systems Feb-8-2026, 21:14:44 GMT

Recently routing impr models for representing Dual-Aspect Transformer (DACT) separately potential are embedded cyclic (CPE) Transformer (i.e., cyclic design a to solve problem based impro across dif

dact, machine learning, reinforcement learning, (17 more...)

Country:

Asia > China (0.04)
Asia > Singapore (0.04)

Industry: Transportation (0.97)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

Neural Information Processing Systems Feb-8-2026, 20:44:44 GMT

In BW, an include: areforwardprogress, failur ), acostcontr ), ashapingrehead). Require:Experience B; twinQ-function 1, 2 (with parameters 1, 2; policyparameter ; discount ; entrop ; learningrates q, ; targetnetw ; Boolean 1: Sampletransition(s, a, r,0) B.r2Rm is 2: Samplepolica0 ( |s0; )andu ( |s; ) 3: rm+1 log (a0|s0; ).Extend 4: j argmin

machine learning, neural information processing system, reinforcement learning, (14 more...)

Country: Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
