On Learning Intrinsic Rewards for Policy Gradient Methods
Zeyu Zheng, Junhyuk Oh, Satinder Singh
Neural Information Processing Systems
In this paper, we build on the Optimal Rewards Framework of Singh et al. [2010], which defines the optimal intrinsic reward function as one that, when used by an RL agent, achieves behavior that optimizes the
- Country:
- North America
- Canada > Quebec
- Montreal (0.04)
- United States > Michigan (0.04)
- Genre:
- Research Report (0.68)
- Industry:
- Leisure & Entertainment > Games > Computer Games (0.31)
- Technology: