AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Large-Scale Retrieval for Reinforcement Learning

Neural Information Processing Systems | Feb-10-2026, 04:46:02 GMT

This allows agents to directly learn in an end-to-end manner to utilise relevant information to inform their outputs. In addition, new information can be attended to by the agent, without retraining, by simply augmenting the retrieval dataset.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Large-Scale Retrieval for Reinforcement Learning

Neural Information Processing Systems | Feb-10-2026, 04:45:58 GMT

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)

2bed6c14cd5ea97a9bc1e6094941bde7-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 04:44:22 GMT

diffusion model, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

7e9fbd01b3084956dd8a070c7bf30bad-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 04:44:12 GMT

causality, hierarchical structure, subgoal, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Neural Information Processing Systems | Feb-10-2026, 03:21:04 GMT

Work done as a visiting student at MIT. 38th Conference on Neural Information Processing Systems (NeurIPS 2024).

large language model, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry:

Energy (0.67)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Finite-Time Analysis of Adaptive Temporal Difference Learning with Deep Neural Networks

Neural Information Processing Systems | Feb-10-2026, 02:59:42 GMT

Nevertheless, the performance guarantee of adaptive TD with neural network approximation remains widely unknown.

approximation, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.05)
North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)

9724412729185d53a2e3e7f889d9f057-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 02:59:25 GMT

algorithm, chemical accuracy, threshold, (13 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > South Holland > Leiden (0.05)
Europe > Poland > Masovia Province > Warsaw (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Constrained episodic reinforcement learning in concave-convex and knapsack settings

Neural Information Processing Systems | Feb-10-2026, 02:57:31 GMT

Our approach relies on the principle of optimism under uncertainty to efficiently explore. Our learning algorithms optimize their actions with respect to a model based on the empirical statistics, while optimistically overestimating rewards and underestimating the resource consumption (i.e., overestimating the distance from the constraint).

constraint, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: