Variational Model-based Policy Optimization