Optimizing Agent Behavior over Long Time Scales by Transporting Value