Variational Bayesian Reinforcement Learning with Regret Bounds

Neural Information Processing Systems 

In reinforcement learning the Q-values summarize the expected future rewards that the agent will attain.
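
As a concrete illustration of what a Q-value summarizes, the sketch below computes Q-values for a tiny tabular MDP using standard Q-value iteration. This is not the variational method the paper proposes; the two-state MDP, the function name `q_value_iteration`, the discount factor, and the iteration count are all assumptions made up for this example.

```python
import numpy as np

def q_value_iteration(P, R, gamma=0.9, iters=200):
    """Standard Q-value iteration on a tabular MDP (illustrative only).

    P: array of shape (S, A, S), transition probabilities P[s, a, s'].
    R: array of shape (S, A), expected immediate reward for (s, a).
    """
    Q = np.zeros_like(R, dtype=float)
    for _ in range(iters):
        V = Q.max(axis=1)          # value of acting greedily in each state
        Q = R + gamma * (P @ V)    # Bellman optimality backup
    return Q

# Hypothetical two-state, two-action MDP:
# state 0, action 0 -> stay in state 0, reward 0
# state 0, action 1 -> move to state 1, reward 1
# state 1 is absorbing with zero reward for both actions
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, :, 1] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])

Q = q_value_iteration(P, R, gamma=0.9)
print(Q)
```

Here `Q[s, a]` is the expected discounted sum of future rewards from taking action `a` in state `s` and acting greedily afterwards. It is a point estimate only and carries no information about how uncertain that estimate is.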
