Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
