Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction