AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Genetic-Gated Networks for Deep Reinforcement Learning

Simyung Chang, John Yang, Jaeseok Choi, Nojun Kwak

Neural Information Processing Systems, Nov-20-2025, 20:21:57 GMT

Exploiting short-sighted gradients should be balanced with adequate exploration. Exploration should therefore be designed to be independent of policy gradients so that it can guide the policy to unseen states.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Country:

Asia > South Korea > Seoul > Seoul (0.05)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Is Q-Learning Provably Efficient?

Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael I. Jordan

Neural Information Processing Systems, Nov-20-2025, 20:18:52 GMT

Model-free reinforcement learning (RL) algorithms, such as Q-learning, directly parameterize and update value functions or policies without explicitly modeling the environment. They are typically simpler, more flexible to use, and thus more prevalent in modern deep RL than model-based approaches. However, empirical work has suggested that model-free algorithms may require more samples to learn [7, 22]. The theoretical question of "whether model-free algorithms can be made sample efficient" is one of the most fundamental questions in RL, and remains unsolved even in the basic scenario with finitely many states and actions.
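
To make "updating value functions without modeling the environment" concrete, here is a minimal sketch of the standard tabular Q-learning update. It is not the algorithm analyzed in the paper. The chain environment, the function name `q_learning`, and all hyperparameter values are assumptions chosen only for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, max_steps=1000, seed=0):
    """Tabular Q-learning on a hypothetical chain environment.

    States are 0..n_states-1. Action 0 moves left (clamped at state 0),
    action 1 moves right. Entering the last state gives reward 1 and
    ends the episode. The agent never sees these rules directly; it
    only observes (state, action, reward, next_state) samples.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    terminal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if s == terminal:
                break
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            # Environment step (hidden from the learner's update rule).
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == terminal else 0.0
            # Q-learning update: move Q(s, a) toward the bootstrapped target.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q_table = q_learning()
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
```

With these assumed settings, the greedy policy read off `q_table` moves right in every non-terminal state. The learner stores only the table `q` and never an explicit transition model, which is the sense in which the method is model-free.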

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

The Importance of Sampling in Meta-Reinforcement Learning

Bradly Stadie, Ge Yang, Rein Houthooft, Peter Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever

Neural Information Processing Systems, Nov-20-2025, 20:13:20 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Learning Abstract Options

Matthew Riemer, Miao Liu, Gerald Tesauro

Neural Information Processing Systems, Nov-20-2025, 20:08:27 GMT

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry:

Education (0.46)
Leisure & Entertainment > Games > Computer Games (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search

Yelong Shen, Jianshu Chen, Po-Sen Huang, Yuqing Guo, Jianfeng Gao

Neural Information Processing Systems, Nov-20-2025, 19:57:47 GMT

Learning to walk over a graph can be formulated as a reinforcement learning (RL) problem with a known state transition model.

m-walk, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Hawaii (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach

Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

Neural Information Processing Systems, Nov-20-2025, 19:38:05 GMT

Potential-based reward shaping incorporates prior domain knowledge in the form of additional rewards provided during training to speed up convergence of reinforcement learning algorithms, without changing the optimal policies (Ng et al. [1999]).
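
As a rough illustration of potential-based shaping in the sense of Ng et al. [1999], the extra reward for a transition from s to s' takes the form γΦ(s') − Φ(s) for some potential function Φ over states. The sketch below is not the paper's Bayesian combination of experts. The helper name `shaped_reward`, the one-dimensional goal-distance potential, and the value of `GOAL` are hypothetical choices made for the example.

```python
def shaped_reward(reward, state, next_state, potential, gamma=0.99):
    """Return the environment reward plus a potential-based shaping term.

    The shaping term is gamma * potential(next_state) - potential(state).
    """
    return reward + gamma * potential(next_state) - potential(state)


# Hypothetical potential: negative distance to a goal at position 5 on a line.
GOAL = 5


def goal_potential(state):
    return -abs(GOAL - state)


# Stepping from 2 to 3 moves toward the goal, so the shaped reward is positive.
toward = shaped_reward(0.0, 2, 3, goal_potential, gamma=1.0)
# Stepping from 3 to 2 moves away from the goal, so it is negative.
away = shaped_reward(0.0, 3, 2, goal_potential, gamma=1.0)
```

With gamma set to 1, the shaping terms along any path telescope to the difference in potential between the final and initial states. That telescoping is the intuition behind why this form of shaping leaves the optimal policies unchanged.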

algorithm, autonomous agent and multiagent system, reinforcement learning, (9 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hardware Conditioned Policies for Multi-Robot Transfer Learning

Neural Information Processing Systems, Nov-20-2025, 19:36:00 GMT

It is also prohibitively expensive to learn a new policy from scratch for each robot's hardware because of the high sample complexity of modern state-of-the-art algorithms.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: