Statistical Efficiency of Distributional Temporal Difference Learning

Yang Peng

Neural Information Processing Systems 

Distributional reinforcement learning (DRL) has achieved empirical success in various domains.