Do No Harm: A Counterfactual Approach to Safe Reinforcement Learning