Policy Gradient for Rectangular Robust Markov Decision Processes

Neural Information Processing Systems 

Policy gradient methods do not account for transition uncertainty, whereas learning robust policies can be computationally expensive.
