AITopics | rtdp

25caef3a545a1fff2ff4055484f0e758-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-19-2026, 10:54:53 GMT

algorithm, final version, reviewer, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)


Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

Neural Information Processing Systems, Feb-11-2026, 17:45:22 GMT

The results are based on a novel analysis of real-time dynamic programming, then extended to model-based RL. Specifically, we generalize existing algorithms that perform full-planning to act by 1-step planning.

artificial intelligence, machine learning, skt, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.05)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)


a18aa23ee676d7f5ffb34cf16df3e08c-Supplemental.pdf

Neural Information Processing Systems, Feb-9-2026, 15:05:45 GMT

algorithm, relation hold, value update, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)


a18aa23ee676d7f5ffb34cf16df3e08c-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 15:05:37 GMT

Real Time Dynamic Programming (RTDP) is an online algorithm based on Dynamic Programming (DP) that acts by 1-step greedy planning.
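
For reference, the 1-step greedy step that RTDP performs at the currently visited state $s_t$ is commonly written as below. The notation is an assumption made here, not taken from the listed paper: $r$ is the reward, $p$ the transition kernel, $V$ the current value estimate, and $\gamma$ a discount factor (set $\gamma = 1$ and index $V$ by time in a finite-horizon setting).

```latex
a_t \in \arg\max_{a} \Big[ r(s_t, a) + \gamma \sum_{s'} p(s' \mid s_t, a)\, V(s') \Big],
\qquad
V(s_t) \leftarrow \max_{a} \Big[ r(s_t, a) + \gamma \sum_{s'} p(s' \mid s_t, a)\, V(s') \Big].
```

Only the visited state $s_t$ is updated, which is what distinguishes RTDP from a full DP sweep over the state space.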

artificial intelligence, machine learning, skt, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Israel (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)


a18aa23ee676d7f5ffb34cf16df3e08c-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-9-2026, 15:05:27 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.37)


Online Planning with Lookahead Policies

Neural Information Processing Systems, Dec-24-2025, 09:33:11 GMT

Real Time Dynamic Programming (RTDP) is an online algorithm based on Dynamic Programming (DP) that acts by 1-step greedy planning. Unlike DP, RTDP does not require access to the entire state space, i.e., it explicitly handles exploration. This makes RTDP particularly appealing when the state space is large and it is not possible to update all states simultaneously. In this work, we devise a multi-step greedy RTDP algorithm, which we call $h$-RTDP, that replaces the 1-step greedy policy with an $h$-step lookahead policy. We analyze $h$-RTDP in its exact form and establish that increasing the lookahead horizon, $h$, results in an improved sample complexity, at the cost of additional computations. This is the first work that proves improved sample complexity as a result of increasing the lookahead horizon in online planning. We then analyze the performance of $h$-RTDP in three approximate settings: approximate model, approximate value updates, and approximate state representation. For these cases, we prove that the asymptotic performance of $h$-RTDP remains the same as that of a corresponding approximate DP algorithm, the best one can hope for without further assumptions on the approximation errors.
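
The idea in the abstract above can be illustrated with a minimal tabular sketch. It is a simplification written for this listing, not the paper's pseudocode, and it rests on several assumptions: a known finite-horizon model, rewards in [0, 1] (so that `H - t` is an optimistic initial value), value layers stored only every `h` steps, and `h` dividing `H`. The names `chain_mdp`, `lookahead_q`, and `h_rtdp` are hypothetical; setting `h=1` gives the usual 1-step greedy RTDP.

```python
import numpy as np


def chain_mdp(S=3):
    """Hypothetical toy MDP: action 0 stays, action 1 moves right; reward 1 in the last state."""
    A = 2
    P = np.zeros((S, A, S))  # P[s, a, s'] = transition probability
    R = np.zeros((S, A))     # R[s, a] = reward, assumed to lie in [0, 1]
    for s in range(S):
        P[s, 0, s] = 1.0
        P[s, 1, min(s + 1, S - 1)] = 1.0
    R[S - 1, :] = 1.0
    return P, R


def lookahead_q(P, R, V, t, end):
    """Q-values at time t, obtained by backing up the stored layer V[end] down to time t."""
    W = V[end].copy()
    # Intermediate layers end-1, ..., t+1. For brevity they are computed for all
    # states; a practical version would expand only states reachable from s_t.
    for _ in range(end - 1, t, -1):
        W = (R + P @ W).max(axis=1)
    return R + P @ W  # shape (S, A)


def h_rtdp(P, R, H, h, s0, episodes, seed=0):
    """Simplified h-step lookahead RTDP; returns value layers and per-episode returns."""
    assert H % h == 0, "this sketch assumes h divides H"
    rng = np.random.default_rng(seed)
    S, _ = R.shape
    # Optimistic initialisation: V[t, s] = H - t, and V[H, s] = 0.
    V = np.repeat(H - np.arange(H + 1, dtype=float)[:, None], S, axis=1)
    returns = []
    for _ in range(episodes):
        s, total = s0, 0.0
        for t in range(H):
            end = (t // h + 1) * h           # next stored value layer
            q = lookahead_q(P, R, V, t, end)[s]
            a = int(np.argmax(q))            # act greedily w.r.t. the lookahead
            if t % h == 0:
                V[t, s] = q[a]               # update only the visited state
            total += R[s, a]
            s = int(rng.choice(S, p=P[s, a]))
        returns.append(total)
    return V, returns


if __name__ == "__main__":
    P, R = chain_mdp()
    for h in (1, 2):
        V, returns = h_rtdp(P, R, H=4, h=h, s0=0, episodes=20)
        print(h, V[0, 0], returns[-1])
```

On this toy chain the optimal return from state 0 with `H=4` is 2, and both `h=1` and `h=2` reach it within a few episodes. The larger `h` does more computation per step but updates fewer stored values, which is the trade-off the abstract describes.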

lookahead policy, name change, online planning, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)


Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

Neural Information Processing Systems, Oct-2-2025, 09:51:04 GMT

State-of-the-art efficient model-based Reinforcement Learning (RL) algorithms typically act by iteratively solving empirical models, i.e., by performing full-planning.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country: Asia > Middle East (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


the final version, we will better emphasize their value as it seems their importance was not properly conveyed

Neural Information Processing Systems, Oct-2-2025, 09:45:59 GMT

We would like to begin by highlighting two contributions of the paper we feel remained unnoticed by R#2 and R#3. Due to its generality it is a powerful tool and is indeed central in all our analysis. RTDP is a well known and practical algorithm. We thank the reviewer for his/her favorable review. Abstract/Line 124/Line 263 - will be corrected, thanks!

artificial intelligence, final version, machine learning, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)
