Navigating to the Best Policy in Markov Decision Processes

Neural Information Processing Systems 

Somewhat surprisingly, learning in a Markov Decision Process is most often considered under the performance criteria of consistency or regret minimization (see e.g.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found