Appendix: On the Expressivity of Markov Reward

Neural Information Processing Systems 

We first address questions that might arise in response to the main text. What does it mean for Bob to *solve* one of these tasks? For a task such as a PO or TO that Bob must learn to solve, when can Alice determine that Bob has solved it? A: As discussed in our introduction, our goal is to examine the expressivity of Markov rewards in the context of finite MDPs. We suggest that, for a given CMP, it is natural to be interested in Markov rewards, but we acknowledge the importance of going beyond such functions.
