BooVI: Provably Efficient Bootstrapped Value Iteration

Dec-24-2025, 00:09:05 GMT–Neural Information Processing Systems

Despite the tremendous success of reinforcement learning (RL) with function approximation, efficient exploration remains a significant challenge, both practically and theoretically. In particular, existing theoretically grounded RL algorithms based on upper confidence bounds (UCBs), such as optimistic least-squares value iteration (LSVI), are often incompatible with practically powerful function approximators, such as neural networks. In this paper, we develop a variant of bootstrapped LSVI, namely BooVI, which bridges this gap between practice and theory.

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)