Q-Learning with Shift-Aware Upper Confidence Bound in Non-Stationary Reinforcement Learning
