Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization