Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy

Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Neural Information Processing Systems 

See also, e.g., [1] for a Bayesian inference perspective.
