172ef5a94b4dd0aa120c6878fc29f70c-AuthorFeedback.pdf

Neural Information Processing Systems 

We thank all reviewers for their valuable feedback. We believe our results make a significant contribution to the field of theoretical reinforcement learning. Therefore, analyzing a variant of Nash Q-learning may be of independent interest. Since a Nash equilibrium (NE) always exists and every NE is also a coarse correlated equilibrium (CCE), a CCE always exists, i.e., the set of linear constraints is always feasible. The "hat" version is the actual certified policy (which can be executed as in Algorithms 2 and 4).
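
The feasibility remark can be illustrated with a minimal sketch. It assumes the textbook definition of a CCE for a two-player matrix game in which both players maximize their own payoff; the exact constraints used in the paper's subroutine may differ, and the helper name `is_cce` and the matching-pennies payoffs are hypothetical choices made for illustration only. The check confirms that the product of the two Nash strategies satisfies every linear CCE constraint, so the constraint set is non-empty.

```python
from itertools import product


def is_cce(pi, u1, u2, tol=1e-9):
    """Check the linear CCE constraints for a two-player matrix game.

    pi[a][b] is the joint probability of the action pair (a, b);
    u1[a][b] and u2[a][b] are the payoffs of players 1 and 2, who both maximize.
    This is the textbook definition (an assumption, not the paper's exact routine).
    """
    n_a, n_b = len(u1), len(u1[0])
    pairs = list(product(range(n_a), range(n_b)))

    # Expected payoffs when both players follow the joint recommendation.
    v1 = sum(pi[a][b] * u1[a][b] for a, b in pairs)
    v2 = sum(pi[a][b] * u2[a][b] for a, b in pairs)

    # Marginal distribution of each player's action under pi.
    marg_a = [sum(pi[a][b] for b in range(n_b)) for a in range(n_a)]
    marg_b = [sum(pi[a][b] for a in range(n_a)) for b in range(n_b)]

    # No fixed unilateral deviation may beat following the recommendation.
    ok1 = all(
        sum(marg_b[b] * u1[a_dev][b] for b in range(n_b)) <= v1 + tol
        for a_dev in range(n_a)
    )
    ok2 = all(
        sum(marg_a[a] * u2[a][b_dev] for a in range(n_a)) <= v2 + tol
        for b_dev in range(n_b)
    )
    return ok1 and ok2


# Matching pennies (zero-sum): the unique NE is uniform play by both players.
u1 = [[1, -1], [-1, 1]]
u2 = [[-1, 1], [1, -1]]
nash_product = [[0.25, 0.25], [0.25, 0.25]]  # product of the two NE strategies
point_mass = [[1.0, 0.0], [0.0, 0.0]]        # always recommend (0, 0)

nash_is_cce = is_cce(nash_product, u1, u2)
point_mass_is_cce = is_cce(point_mass, u1, u2)
```

Because the constraints are linear in the entries of `pi`, the set of CCEs is a polytope, and the Nash product distribution is one point guaranteed to lie in it.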
