Bridging Discrete and Continuous RL: Stable Deterministic Policy Gradient with Martingale Characterization
