Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Neural Information Processing Systems 

We prove that, for a linear parametrization, gradient descent converges to global optima despite non-linearity and non-convexity introduced by the implicit representation.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found