Experiments with Infinite-Horizon, Policy-Gradient Estimation