Review 1
Neural Information Processing Systems
See the detailed reasons below. In RL, it is widely known that the average-reward setting is more challenging to study. Coupling it with the multi-agent setting brings additional challenges. Specifically, as shown in Appendix A.2 of the paper, our average-reward problem captures certain NP-hard instances. Similar complexity results can be found in [Blondel and Tsitsiklis 2000].
Nov-13-2025, 10:18:45 GMT