Average-Reward Reinforcement Learning with Trust Region Methods
