Non-crossing quantile regression for deep reinforcement learning
Neural Information Processing Systems
Distributional reinforcement learning (DRL) estimates the full distribution over future returns, rather than only its mean, to capture the intrinsic uncertainty of Markov decision processes (MDPs) more efficiently.
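To illustrate why a return distribution carries more information than its mean, here is a minimal sketch in plain Python. It is not the paper's method: the return samples, the helper names `pinball_loss` and `empirical_quantile`, and the chosen quantile levels are all assumptions made for illustration. It uses the standard quantile-regression (pinball) loss to recover quantiles of two hypothetical return samples that share the same mean.

```python
def pinball_loss(tau, target, estimate):
    """Quantile-regression (pinball) loss for one sample at level tau."""
    diff = target - estimate
    return max(tau * diff, (tau - 1.0) * diff)


def empirical_quantile(samples, tau):
    """Pick the sample value that minimizes the total pinball loss at level tau."""
    return min(samples, key=lambda q: sum(pinball_loss(tau, s, q) for s in samples))


# Hypothetical returns for two actions: same mean, very different spread.
safe_returns = [4.0, 5.0, 5.0, 5.0, 6.0]
risky_returns = [-5.0, 0.0, 5.0, 10.0, 15.0]

mean_safe = sum(safe_returns) / len(safe_returns)
mean_risky = sum(risky_returns) / len(risky_returns)

taus = [0.1, 0.5, 0.9]  # illustrative quantile levels
safe_q = [empirical_quantile(safe_returns, t) for t in taus]
risky_q = [empirical_quantile(risky_returns, t) for t in taus]

print("means:", mean_safe, mean_risky)
print("safe quantiles:", safe_q)
print("risky quantiles:", risky_q)
```

The two means are identical, so a mean-only estimate cannot tell the actions apart, while the quantiles expose the difference in spread. The quantiles recovered here come out in increasing order across levels; quantile estimates that are learned independently by a function approximator carry no such guarantee, which is the crossing issue the title refers to.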
Aug-22-2025, 00:42:47 GMT