Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding