Appendix: On the Expressivity of Markov Reward
Neural Information Processing Systems
We first address questions that might arise in response to the main text. What does it mean for Bob to *solve* one of these tasks? [...] PO, or TO for Bob to learn to solve, when can Alice determine that Bob has solved the task? A: Indeed, as discussed in our introduction, our goal is to examine the expressivity of Markov rewards in the context of finite MDPs. Instead, we suggest that for a given CMP, it is natural to be interested in Markov rewards, but we acknowledge the importance of going beyond such functions.
Nov-20-2025, 08:52:34 GMT