Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation