ProSh: Probabilistic Shielding for Model-free Reinforcement Learning