Bayesian Control of Large MDPs with Unknown Dynamics in Data-Poor Environments