Scalar Posterior Sampling with Applications

Georgios Theocharous, Zheng Wen, Yasin Abbasi Yadkori, Nikos Vlassis

Neural Information Processing Systems 

Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity. We prove a Bayesian regret bound under mild assumptions. Our result is more generally applicable to multiple parameters and continuous state action problems. We compare our algorithm with state-of-the-art PSRL algorithms on standard discrete and continuous problems from the literature.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found