Compositional Policy Learning in Stochastic Control Systems with Formal Guarantees
Đorđe Žikelić