Online Shielding for Reinforcement Learning