AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

e985dfca10e1167c0836a70880ef0858-Paper-Conference.pdf

Neural Information Processing SystemsFeb-17-2026, 18:03:32 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(9 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Data Science (0.67)
(3 more...)

Add feedback

A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes

Neural Information Processing SystemsFeb-17-2026, 18:03:01 GMT

The proximal policy optimization (PPO) algorithm stands as one of the most prosperous methods in the field of reinforcement learning (RL). Despite its success, the theoretical understanding of PPO remains deficient. Specifically, it is unclear whether PPO or its optimistic variants can effectively solve linear Markov decision processes (MDPs), which are arguably the simplest models in RL with function approximation.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Add feedback

b91ea2ba1b6521940e204b7b81d5ef47-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-17-2026, 17:49:14 GMT

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Austria > Styria > Graz (0.04)
North America > United States > Indiana > Marion County > Indianapolis (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.68)

Industry:

Transportation > Ground > Road (1.00)
Leisure & Entertainment > Sports > Motorsports (1.00)
Automobiles & Trucks (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Effectively Learning Initiation Sets in Hierarchical Reinforcement Learning

Neural Information Processing SystemsFeb-17-2026, 17:47:52 GMT

Sutton et al., 2022], is elegantly captured using the options framework [Sutton et al., 1999].

initiation, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Cross-Domain Policy Adaptation via V alue-Guided Data Filtering

Neural Information Processing SystemsFeb-17-2026, 17:47:12 GMT

Part of this work was done during Kang Xu's internship at Shanghai Artificial Intelligence Laboratory.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.24)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

e8916198466e8ef218a2185a491b49fa-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsFeb-17-2026, 17:46:52 GMT

library, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec (0.05)
North America > United States > Maryland > Baltimore (0.05)
Europe > Sweden > Skåne County > Malmö (0.04)
(6 more...)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Successor-Predecessor Intrinsic Exploration Changmin Y u 1,2 Neil Burgess

Neural Information Processing SystemsFeb-17-2026, 17:12:30 GMT

Exploration is essential in reinforcement learning, particularly in environments where external rewards are sparse. Here we focus on exploration with intrinsic rewards, where the agent transiently augments the external rewards with self-generated intrinsic rewards.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: