A Algorithms

Algorithm 1: MAP Propagation - Monte-Carlo Policy-Gradient Control

Input: differentiable policy function: π