Policy Optimization for Markov Games: Unified Framework and Faster Convergence
Runyu Zhang
Harvard University

Neural Information Processing Systems 

Policy optimization, i.e., algorithms that learn to make sequential decisions by local search directly on the agent's policy, is a widely used class of algorithms in reinforcement learning [
