Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence