Kernelized Reinforcement Learning with Order Optimal Regret Bounds

Neural Information Processing Systems 

Our results show a significant improvement over the state of the art, polynomial in the number of episodes.
