Proximal Policy Optimization with Continuous Bounded Action Space via the Beta Distribution
