Robust Exploratory Stopping under Ambiguity in Reinforcement Learning