Uncertainty-Based Out-of-Distribution Classification in Deep Reinforcement Learning