Truncated Variance Reduced Value Iteration

Mar-22-2026, 15:30:32 GMT–Neural Information Processing Systems

We provide faster randomized algorithms for computing an $\epsilon$-optimal policy in a discounted Markov decision process with $A_{\text{tot}}$ state-action pairs, bounded rewards, and discount factor $\gamma$.
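For orientation, the problem setting can be illustrated with classical value iteration, the deterministic baseline that randomized methods of this kind aim to speed up. The sketch below is not the paper's truncated variance-reduced algorithm. The function name `value_iteration`, the dense arrays `P` and `r`, the assumption that every state has the same number of actions (so that $A_{\text{tot}} = S \cdot A$), and the requirement $0 < \gamma < 1$ are all illustrative assumptions.

```python
import numpy as np


def value_iteration(P, r, gamma, eps):
    """Classical value iteration on a tabular discounted MDP (illustrative baseline).

    P:     array of shape (S, A, S); P[s, a, t] is the probability of moving
           from state s to state t under action a (assumed dense here).
    r:     array of shape (S, A) of bounded rewards.
    gamma: discount factor, assumed to satisfy 0 < gamma < 1.
    eps:   target accuracy for the returned policy.
    """
    num_states = P.shape[0]
    v = np.zeros(num_states)
    # Standard stopping rule: once successive iterates differ by at most
    # eps * (1 - gamma) / (2 * gamma) in the sup norm, the policy that is
    # greedy with respect to the newest iterate is eps-optimal.
    tol = eps * (1.0 - gamma) / (2.0 * gamma)
    while True:
        q = r + gamma * (P @ v)      # Bellman backup, shape (S, A)
        v_new = q.max(axis=1)        # best action value in each state
        if np.max(np.abs(v_new - v)) <= tol:
            q_new = r + gamma * (P @ v_new)
            return v_new, q_new.argmax(axis=1)
        v = v_new
```

Each sweep touches every entry of `P`, which is the per-iteration cost that sampling-based approaches try to avoid. Calling `value_iteration(P, r, 0.9, 0.01)` on a small hand-built MDP returns the value estimate and a greedy policy as an array of action indices.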

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.39)