Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Feb-10-2026, 19:20:21 GMT – Neural Information Processing Systems

It is known that policy evaluation has the interpretation of solving a generalized Bellman equation. In this paper, we derive finite-sample bounds for any general off-policy TD-like stochastic approximation algorithm that solves for the fixed point of this generalized Bellman operator.
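To make the fixed-point view concrete, here is a minimal sketch of one simple member of this family: tabular one-step off-policy TD(0) with importance-sampling ratios. It is not taken from the paper. The two-state MDP, the policies `pi` and `b`, the discount `gamma`, the step-size schedule, and all function names are assumptions chosen for illustration. The sketch compares the stochastic iterate against the exact solution of the ordinary Bellman equation for the target policy, which is the fixed point this particular update tracks.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (all numbers are illustrative assumptions).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],    # P[s, a, s']: transition probabilities
              [[0.7, 0.3], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],                  # R[s, a]: deterministic rewards
              [0.0, 1.0]])
pi = np.array([[0.7, 0.3], [0.4, 0.6]])    # target policy pi(a | s)
b = np.array([[0.5, 0.5], [0.5, 0.5]])     # behavior policy b(a | s)
gamma = 0.5                                # discount factor


def exact_value(P, R, pi, gamma):
    """Solve the Bellman equation V = r_pi + gamma * P_pi V directly."""
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state kernel under pi
    r_pi = np.sum(pi * R, axis=1)          # expected one-step reward under pi
    return np.linalg.solve(np.eye(len(r_pi)) - gamma * P_pi, r_pi)


def off_policy_td0(P, R, pi, b, gamma, n_steps=50_000, seed=0):
    """One-step importance-sampled TD(0) on a single behavior trajectory."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    V = np.zeros(n_states)
    visits = np.zeros(n_states)
    s = 0
    for _ in range(n_steps):
        a = rng.choice(n_actions, p=b[s])        # act with the behavior policy
        s_next = rng.choice(n_states, p=P[s, a])
        rho = pi[s, a] / b[s, a]                 # importance-sampling ratio
        delta = R[s, a] + gamma * V[s_next] - V[s]
        visits[s] += 1
        alpha = 2.0 / (visits[s] + 10.0)         # diminishing step size (assumed)
        V[s] += alpha * rho * delta
        s = s_next
    return V


v_exact = exact_value(P, R, pi, gamma)
v_td = off_policy_td0(P, R, pi, b, gamma)
print("exact:", v_exact)
print("TD(0):", v_td)
```

Multi-step off-policy variants replace the one-step target with a longer corrected return, which changes the operator whose fixed point is being computed. The finite-sample bounds described in the abstract concern how fast iterates of this kind approach that fixed point.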

artificial intelligence, machine learning, reinforcement learning


