Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs