Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning