Reinforcement Learning under Model Mismatch

Aurko Roy, Huan Xu, Sebastian Pokutta

Neural Information Processing Systems 

We scale up the robust algorithms to large MDPs via function approximation and prove convergence under two different settings.
