Bayesian Optimistic Optimization: Optimistic

Neural Information Processing Systems 

In this paper, we consider reinforcement learning (RL) in Markov decision processes (MDPs), where the agent observes the state of the environment at each timestep and makes decisions accordingly.
