Dynamic Shielding for Reinforcement Learning in Black-Box Environments