Demystifying the Recency Heuristic in Temporal-Difference Learning