Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach

Neural Information Processing Systems 

A major challenge in reinforcement learning is determining which state-action pairs are responsible for delayed future rewards.