Policy composition in reinforcement learning via multi-objective policy optimization