Variational Inference for Data-Efficient Model Learning in POMDPs