Review for NeurIPS paper: POMO: Policy Optimization with Multiple Optima for Reinforcement Learning