Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization
–Neural Information Processing Systems
We prove that, for a linear parametrization, gradient descent converges to global optima despite non-linearity and non-convexity introduced by the implicit representation.
Neural Information Processing Systems
Dec-27-2025, 17:47:59 GMT
- Country:
- Europe
- Germany > Baden-Württemberg
- Freiburg (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.14)
- Germany > Baden-Württemberg
- North America > United States
- Massachusetts > Middlesex County > Cambridge (0.04)
- Europe
- Genre:
- Research Report > New Finding (0.69)
- Technology: