Energy Regularized RNNs for Solving Non-Stationary Bandit Problems