Conservative Offline Policy Adaptation in Multi-Agent Games

Neural Information Processing Systems 

We prove that Constrained Self-Play (CSP) learns a near-optimal risk-free offline adaptation policy upon convergence.
