System stabilization with policy optimization on unstable latent manifolds