Finite-Time Bounds for Average-Reward Fitted Q-Iteration
