AITopics | Reinforcement Learning

When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning

Neural Information Processing Systems Feb-9-2026, 15:16:25 GMT

Figure 4: A subset of our evaluation tasks: Tabletop Manipulation, Peg Insertion, and Half-Cheetah Velocity.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

83fa5a432ae55c253d0e60dbfa716723-Paper.pdf

Neural Information Processing Systems Feb-9-2026, 15:16:16 GMT

amortization, international conference, optimization, (12 more...)

Country:

North America > United States > California (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

8396b14c5dff55d13eea57487bf8ed26-Supplemental.pdf

Neural Information Processing Systems Feb-9-2026, 15:06:39 GMT

evaluation, inequality, mdp, (10 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Nearly Horizon-Free Offline Reinforcement Learning

Neural Information Processing Systems Feb-9-2026, 15:06:35 GMT

A (potentially …) … is $\pi = (\pi_1, \pi_2, \ldots, \pi_H)$, where $\pi_h : S \to$ … It holds $V_h(s)$ … depends … $\hat{P}(s' \mid s, a)$, … $S$ factor.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)

a18aa23ee676d7f5ffb34cf16df3e08c-AuthorFeedback.pdf

Neural Information Processing Systems Feb-9-2026, 15:05:27 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.37)

6af779991368999ab3da0d366c208fba-Paper-Conference.pdf

Neural Information Processing Systems Feb-9-2026, 14:56:06 GMT

Planning enables autonomous agents to solve complex decision-making problems by evaluating predictions of the future. However, classical planning algorithms often become infeasible in real-world settings where state spaces are high-dimensional and transition dynamics unknown.

curranassociate, machine learning, reinforcement learning, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Germany (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)

6aef8bffb372096ee73d98da30119f89-Paper-Conference.pdf

Neural Information Processing Systems Feb-9-2026, 14:55:52 GMT

constraint, reinforcement, si-crl, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Ukraine > Kharkiv Oblast > Kharkiv (0.04)
(2 more...)

Genre: Research Report (0.68)

Industry: Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)

Self-Imitation Learning via Generalized Lower Bound Q-learning

Neural Information Processing Systems Feb-9-2026, 14:54:58 GMT

The naive importance sampling (IS) estimator involves products of the form $\pi(a_t \mid x_t)/\mu(a_t \mid x_t)$ and is infeasible in practice due to high variance. To control the variance, a line of prior work has focused on operator-based estimation to avoid full IS products, which reduces the estimation procedure to repeated iterations of off-policy evaluation operators [1-3].
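
The variance problem in the excerpt above can be illustrated with a small simulation. This is a minimal sketch under assumed conditions, not the paper's method: the two-action toy problem, the function names `is_weight` and `sample_weights`, and the probabilities `pi_p` and `mu_p` are all hypothetical choices made for illustration. It multiplies the per-step ratios of target to behavior probabilities along each trajectory and prints the empirical mean and variance of the resulting weights for several horizons.

```python
import random
import statistics


def is_weight(pi_probs, mu_probs):
    """Product of per-step ratios pi(a_t | x_t) / mu(a_t | x_t) for one trajectory."""
    weight = 1.0
    for p, m in zip(pi_probs, mu_probs):
        weight *= p / m
    return weight


def sample_weights(horizon, n_traj, pi_p=0.8, mu_p=0.5, seed=0):
    """Sample full-trajectory IS weights in a hypothetical two-action problem.

    The behavior policy mu picks action 1 with probability mu_p at every step;
    the target policy pi picks it with probability pi_p. States are ignored
    to keep the example small.
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(n_traj):
        pi_probs, mu_probs = [], []
        for _ in range(horizon):
            # Actions are drawn from the behavior policy mu.
            action = 1 if rng.random() < mu_p else 0
            pi_probs.append(pi_p if action == 1 else 1.0 - pi_p)
            mu_probs.append(mu_p if action == 1 else 1.0 - mu_p)
        weights.append(is_weight(pi_probs, mu_probs))
    return weights


# Print horizon, empirical mean, and empirical variance of the weights.
for horizon in (1, 10, 50):
    w = sample_weights(horizon, n_traj=10_000)
    print(horizon, statistics.mean(w), statistics.pvariance(w))
```

In this toy setting the weights have expectation 1 at every horizon, while the second moment of each per-step ratio is above 1, so the variance of the full product grows geometrically with the horizon. That growth is the motivation for avoiding full IS products.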

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
