Review 1

Neural Information Processing Systems 

See the detailed reasons below. In RL, the average-reward setting is widely known to be more challenging to study. Coupling it with the multi-agent setting brings additional challenges. Specifically, as shown in Appendix A.2 of the paper, our average-reward problem captures certain NP-hard instances. Similar complexity results can be found in [Blondel and Tsitsiklis 2000].
