Model-Free Reinforcement Learning with the Decision-Estimation Coefficient

Dylan J. Foster