AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

fe51de4e7baf52e743b679e3bdba7905-Paper-Conference.pdf

Neural Information Processing Systems, Feb-13-2026, 03:53:01 GMT

data distribution, dataset, generator, (12 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.69)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

6f2268bd1d3d3ebaabb04d6b5d099425-Paper.pdf

Neural Information Processing Systems, Feb-13-2026, 03:52:47 GMT

demonstration, internal dynamic model, real dynamic, (9 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Regret Bounds for Learning State Representations in Reinforcement Learning

Ronald Ortner, Matteo Pirotta, Alessandro Lazaric, Ronan Fruit, Odalric-Ambrym Maillard

Neural Information Processing Systems, Feb-13-2026, 03:37:14 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, markov model, representation, (16 more...)

Country:

North America > United States (0.04)
North America > Canada (0.04)
Europe > France > Hauts-de-France > Pas-de-Calais (0.04)
Europe > Austria > Styria > Leoben (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

State Regularized Policy Optimization on Data with Dynamics Shift

Neural Information Processing Systems, Feb-13-2026, 03:27:31 GMT

We then demonstrate a lower-bound performance guarantee on policies regularized by the stationary state distribution. In practice, SRPO can be an add-on module to context-based algorithms in both online and offline RL settings.

machine learning, reinforcement learning, state distribution, (15 more...)

Country: Asia > Singapore (0.05)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)

Importance Resampling for Off-policy Prediction

Neural Information Processing Systems, Feb-13-2026, 03:27:16 GMT

Though unbiased, IS can be high-variance. A lower-variance alternative is Weighted IS (WIS). Figure 4: Learning rate sensitivity plots in the Random Walk Markov Chain, with buffer size n = 15000 and mini-batch size k = 16.
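
The contrast between ordinary importance sampling (IS) and weighted importance sampling (WIS) can be sketched in a few lines. This is a minimal illustration, not the paper's importance resampling method: the function names `is_estimate` and `wis_estimate` and the toy inputs are assumptions made here, and `ratios` stands for target-to-behaviour probability ratios that are assumed to be precomputed.

```python
def is_estimate(returns, ratios):
    """Ordinary IS: average of ratio-weighted returns (unbiased, can be high-variance)."""
    weighted = sum(rho * g for rho, g in zip(ratios, returns))
    return weighted / len(returns)


def wis_estimate(returns, ratios):
    """Weighted IS: normalise by the sum of ratios (biased, typically lower variance)."""
    total = sum(ratios)
    if total == 0:
        # No sample has any weight under the target policy; fall back to 0.
        return 0.0
    weighted = sum(rho * g for rho, g in zip(ratios, returns))
    return weighted / total


# Hypothetical batch: every sampled return is 1.0, every ratio is 1.8.
returns = [1.0, 1.0]
ratios = [1.8, 1.8]
is_value = is_estimate(returns, ratios)
wis_value = wis_estimate(returns, ratios)
```

In this toy batch the ordinary IS estimate is scaled by the ratios and lands above any observed return, while the WIS estimate stays inside the range of observed returns because the weights are normalised to sum to one.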

artificial intelligence, machine learning, reinforcement learning, (10 more...)

Country:

North America > Canada > Alberta (0.06)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

6aed000af86a084f9cb0264161e29dd3-Paper.pdf

Neural Information Processing Systems, Feb-13-2026, 03:02:07 GMT

algorithm, trajectory, variance, (14 more...)

Country:

Europe > Italy > Lombardy > Milan (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Neural Temporal-Difference Learning Converges to Global Optima

Qi Cai, Zhuoran Yang, Jason D. Lee, Zhaoran Wang

Neural Information Processing Systems, Feb-13-2026, 02:18:02 GMT

TD converges at a sublinear rate to the global optimum of the mean-squared projected Bellman error for policy evaluation.
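
For readers unfamiliar with TD policy evaluation, the update rule can be sketched in tabular form. This is a deliberately simplified stand-in: the paper concerns TD with neural network function approximation, whereas the sketch below uses a lookup table, and the function name `td0_evaluate`, the two-state chain, and the step-size and discount values are all assumptions chosen for illustration.

```python
def td0_evaluate(transitions, num_states, alpha=0.1, gamma=0.9, sweeps=500):
    """Tabular TD(0): move V(s) toward the bootstrapped target r + gamma * V(s')."""
    values = [0.0] * num_states
    for _ in range(sweeps):
        for state, reward, next_state in transitions:
            # next_state of None marks termination, so the bootstrap term is zero.
            bootstrap = 0.0 if next_state is None else values[next_state]
            td_error = reward + gamma * bootstrap - values[state]
            values[state] += alpha * td_error
    return values


# Hypothetical two-state chain under a fixed policy:
# state 0 -> state 1 with reward 0, state 1 -> terminal with reward 1.
chain = [(0, 0.0, 1), (1, 1.0, None)]
values = td0_evaluate(chain, num_states=2)
```

With a discount of 0.9 the true values of this chain are 0.9 for state 0 and 1.0 for state 1, and the estimates approach them as the sweeps accumulate.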

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

fbd8e65962da06f83f3f28b52774ffd0-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-13-2026, 01:55:19 GMT

agent, obstacle, point maze, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.41)

ASPiRe: Adaptive Skill Priors for Reinforcement Learning

Neural Information Processing Systems, Feb-13-2026, 01:55:15 GMT

Transferring prior experience to new tasks is central to an agent's adaptability. In this work, we aim to accelerate online reinforcement learning by leveraging prior experience from large offline data.

aspire, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Instructional Material (0.34)
Research Report (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Distributional Reward Decomposition for Reinforcement Learning

Zichuan Lin, Li Zhao, Derek Yang, Tao Qin, Tie-Yan Liu, Guangwen Yang

Neural Information Processing Systems, Feb-13-2026, 01:16:51 GMT

Van Seijen et al. [2017] propose to split a state into different sub-states, each with a sub-reward obtained by training a general value function, and learn multiple value functions with sub-rewards. The architecture is rather limited due to requiring prior knowledge of how to split into sub-states.

machine learning, reinforcement, reinforcement learning, (17 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China > Shandong Province > Qingdao (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
