Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
