AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

3d3d286a8d153a4a58156d0e02d8570c-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 07:44:19 GMT

dataset, ood action, q-value, (13 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > Netherlands > South Holland > Delft (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.57)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

OntheConvergenceofSmoothRegularized ApproximateValueIterationSchemes

Neural Information Processing SystemsFeb-8-2026, 07:36:06 GMT

In practical settings, the reinforcement learning (RL) algorithms are faced with a challenge of maximizing the cumulative reward given a finite sample of environment transitions and inexact representation ofpolicyandvaluefunction. This givesrisetoerrors thatpropagateacross learning iterations and, combined, can result in divergence. Recently, state-of-the-art RL algorithms have been successful in solving complex environments and, hence, overcoming inaccuracies and their accumulation.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

Neural Information Processing SystemsFeb-8-2026, 07:07:26 GMT

Humans can quickly learn new tasks from just a handful of examples by preserving rich representations of experience [1].

dynamic model, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Instructional Material > Online (0.40)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

477bdb55b231264bb53a7942fd84254d-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 07:06:55 GMT

apprenticeship, demonstration, demonstrator, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.96)

Industry:

Health & Medicine (0.93)
Transportation > Passenger (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.46)

Add feedback

SupplementaryMaterials AProofofTheorem2: AsymptoticConvergenceofRobustQ-Learning

Neural Information Processing SystemsFeb-8-2026, 06:47:18 GMT

From[BorkarandMeyn,2000],weknowthatthestochastic approximation (18) converges to the fixed point ofT, i.e., Q . Finally, to show Theorem 3, we only need to show each term in(56) is smaller than . In this section we develop the finite-time analysis of the robust TDC algorithm. We note that recently there are several works [Srikant and Ying, 2019, Xu and Liang, 2021, Kaledin et al., 2020] on finite-time analysis of RL algorithms that do not need theprojection. Specifically, the problem in [Srikant and Ying, 2019] is for one time scalelinear stochastic approximation.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Add feedback

OnlineRobustReinforcementLearningwithModel Uncertainty

Neural Information Processing SystemsFeb-8-2026, 06:47:15 GMT

Robust reinforcement learning (RL) is to find a policy that optimizes the worstcase performance over an uncertainty set of MDPs. In this paper, we focus on model-freerobust RL, where the uncertainty set is defined to be centering at a misspecified MDP that generates a single sample trajectory sequentially, and is assumed to beunknown.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology: