AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
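
The trial-and-error idea in the quotation can be illustrated with a minimal sketch. It assumes a three-armed bandit with made-up reward means and an epsilon-greedy learner; the function name `run_bandit`, the reward values, and the noise level are illustrative assumptions, not something taken from Sutton and Barto's text. The learner is never told which arm is best and only sees the rewards of the arms it tries.

```python
import random

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a hypothetical multi-armed bandit."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each arm's mean reward
    counts = [0] * n_arms       # number of times each arm was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment returns a noisy numerical reward (assumed Gaussian).
        reward = rng.gauss(true_means[arm], 0.1)

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts

estimates, counts = run_bandit([0.1, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated means:", [round(e, 2) for e in estimates])
print("arm judged best:", best_arm)
```

With these assumed reward means, the learner should come to prefer the last arm, purely from the rewards it observes.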

News Overviews Instructional Materials AI-Alerts Classics

12bcf58a1c09a0fcb5310f3589291ab4-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 02:04:28 GMT

agent, algorithm, utterance, (13 more...)

Country:

Europe > United Kingdom (0.04)
Europe > France (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre:

Research Report > New Finding (1.00)
Personal (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

Neural Dynamic Policies for End-to-End Sensorimotor Learning

Neural Information Processing Systems, Feb-8-2026, 01:56:59 GMT

The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make a decision at each point in training and hence limits scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has long exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations.
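
As a rough illustration of what using a dynamical system as a policy representation can mean, the sketch below rolls out a critically damped second-order point attractor. This is a generic textbook-style system chosen for illustration, not the architecture of the paper summarized above; the function name `rollout`, the gain values, and the step size are all assumptions. The point is that a single low-dimensional parameter (the goal) determines a whole trajectory, instead of a separate raw action being chosen at every step.

```python
def rollout(y0, goal, steps=300, dt=0.01, alpha=25.0):
    """Integrate y'' = alpha * (beta * (goal - y) - y') from rest at y0."""
    beta = alpha / 4.0  # this ratio makes the system critically damped
    y, yd = y0, 0.0     # position and velocity
    trajectory = [y]

    for _ in range(steps):
        # Acceleration pulls the state toward the goal and damps velocity.
        ydd = alpha * (beta * (goal - y) - yd)
        # Semi-implicit Euler step: update velocity first, then position.
        yd += ydd * dt
        y += yd * dt
        trajectory.append(y)

    return trajectory

trajectory = rollout(y0=0.0, goal=1.0)
print("start:", trajectory[0], "end:", round(trajectory[-1], 4))
```

In a learned setting, a network would output parameters such as `goal` and the system would generate the motion; here the goal is simply hard-coded.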

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)

Learning Shared Safety Constraints from Multi-task Demonstrations

Neural Information Processing Systems, Feb-8-2026, 01:37:11 GMT

If a friend was in your kitchen and you told them to "make toast" or "clean the dishes," you would

constraint, machine learning, reinforcement learning, (14 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

2d4027d6df9c0256b8d4474ce88f8c88-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 01:37:02 GMT

action validity prediction network, construction, information, (11 more...)

Country: North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

2d4027d6df9c0256b8d4474ce88f8c88-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 01:36:58 GMT

construction, international conference, proceedings, (14 more...)

Country:

Europe > Sweden > Stockholm > Stockholm (0.05)
Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(13 more...)

Genre: Research Report (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

2a07348a6a7b2c208ab5cb1ee0e78ab5-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 01:26:21 GMT

constraint, cumulative sum cost, probability, (11 more...)

Country:

North America > United States (0.07)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.93)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

29e4b51d45dc8f534260adc45b587363-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 01:24:37 GMT

agent, permutation, symmetry, (14 more...)

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

121db870b0470dd63bb5bc59c724275a-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 01:14:49 GMT

algorithm, behavior policy, counterfactual decision, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Pennsylvania (0.04)
North America > United States > Montana (0.04)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

2938ad0434a6506b125d8adaff084a4a-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 00:57:18 GMT

forward transfer, learning, reinforcement, (16 more...)

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Poland > Lesser Poland Province > Kraków (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(7 more...)

Genre:

Research Report (0.46)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

0fe6a18be9491139fb759e2f645374b1-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 00:55:24 GMT

complexity, mdp, optimal policy, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.92)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
(2 more...)