Policy Optimization in Hybrid Discrete-Continuous Action Spaces via Mixed Gradients
