Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models