Diverse Policy Optimization for Structured Action Space