Stable deep reinforcement learning method by predicting uncertainty in rewards as a subtask

Open in new window