Discretizing Continuous Action Space for On-Policy Optimization
–arXiv.org Artificial Intelligence
The explosion in the number of discrete actions can be efficiently addressed by a policy with factorized distribution across action dimensions. We show that the discrete policy achieves significant performance gains with state-of-the-art on-policy optimization algorithms (PPO, TRPO, ACKTR), especially on high-dimensional tasks with complex dynamics. Additionally, we show that an ordinal ...

... combinations of joint atomic actions, which quickly becomes intractable when M increases. However, a simple fix is to represent the joint distribution over discrete actions as factorized across dimensions, so that the joint policy is still tractable. As prior works have applied such discretization methods in practice (OpenAI, 2018; Jaśkowski et al., 2018), we aim to carry out a systematic study of this straightforward discretization method in simulated environments, and show how it improves upon on-policy optimization baselines.
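The factorization across action dimensions can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes each of the M action dimensions is split into K evenly spaced atomic actions (K, the even spacing, and the helper names `make_bins`, `log_softmax`, and `sample_factorized` are assumptions made for illustration) and that the policy outputs one row of K logits per dimension. A joint categorical distribution would need K^M entries, whereas the factorized form needs only M * K logits, and its joint log-probability is the sum of the per-dimension log-probabilities.

```python
import numpy as np


def make_bins(low, high, k):
    """Return evenly spaced atomic actions per dimension, shape (M, K).

    Assumption: uniform spacing between the per-dimension bounds.
    """
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    fractions = np.linspace(0.0, 1.0, k)  # shape (K,)
    return low[:, None] + (high - low)[:, None] * fractions[None, :]


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def sample_factorized(logits, bins, rng):
    """Sample one joint action from independent per-dimension categoricals.

    logits: (M, K) array, one categorical distribution per action dimension.
    bins:   (M, K) array of atomic action values.
    Returns the continuous action (M,), the chosen bin indices (M,), and the
    joint log-probability, which is the sum over dimensions.
    """
    log_p = log_softmax(logits)
    m, k = logits.shape
    idx = np.array([rng.choice(k, p=np.exp(log_p[i])) for i in range(m)])
    rows = np.arange(m)
    action = bins[rows, idx]
    log_prob = log_p[rows, idx].sum()
    return action, idx, log_prob


# Usage with hypothetical sizes: M = 3 dimensions, K = 5 bins in [-1, 1].
rng = np.random.default_rng(0)
bins = make_bins([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0], 5)
logits = np.zeros((3, 5))  # uniform policy, for illustration only
action, idx, log_prob = sample_factorized(logits, bins, rng)
print(action, idx, log_prob)
```

In an on-policy method such as PPO or TRPO, the summed log-probability would stand in for the log-density of a continuous policy when forming likelihood ratios; the logits would normally come from a neural network conditioned on the state rather than from a fixed array.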
Feb-1-2019
- Country:
- North America > United States
- New York > New York County > New York City (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (0.40)
- Technology: