RD²: Reward Decomposition with Representation Decomposition

Oct-10-2024, 16:13:16 GMT–Neural Information Processing Systems

Reward decomposition, which aims to decompose the full reward into multiple sub-rewards, has been shown to improve sample efficiency in reinforcement learning. Existing methods for discovering a reward decomposition are mostly policy dependent, which constrains how diverse or disentangled the behavior of policies induced by different sub-rewards can be. In this work, we propose a set of novel reward decomposition principles that constrain the uniqueness and compactness of the state features/representations relevant to each sub-reward. Our principles encourage sub-rewards that depend on a minimal set of relevant features, while keeping each sub-reward unique. We derive a deep learning algorithm based on these principles and call our method RD², since it learns reward decomposition and representation decomposition jointly.
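As a rough illustration of the two principles named above, the following toy sketch checks a hand-written decomposition on a four-feature state. It is not the paper's algorithm: the reward function, the binary feature masks, and every name here (`full_reward`, `MASKS`, `compactness`, `masks_are_unique`) are hypothetical, chosen only to show what "sub-rewards sum to the full reward", "minimal relevant features", and "unique sub-rewards" could mean concretely.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Mask = Tuple[int, ...]


def masked(state: State, mask: Mask) -> List[float]:
    """Zero out every feature the mask marks as irrelevant."""
    return [x if keep else 0.0 for x, keep in zip(state, mask)]


def full_reward(state: State) -> float:
    """Hypothetical toy reward that depends only on features 0 and 3."""
    return state[0] + 2.0 * state[3]


# Hypothetical decomposition: each sub-reward sees only its own masked features.
MASKS: List[Mask] = [(1, 0, 0, 0), (0, 0, 0, 1)]
SUB_REWARDS: List[Callable[[State], float]] = [
    lambda z: z[0],        # sub-reward 0 reads feature 0
    lambda z: 2.0 * z[3],  # sub-reward 1 reads feature 3
]


def decomposed_rewards(state: State) -> List[float]:
    """Evaluate each sub-reward on its masked view of the state."""
    return [f(masked(state, m)) for f, m in zip(SUB_REWARDS, MASKS)]


def compactness(masks: Sequence[Mask]) -> int:
    """Total number of features used across sub-rewards (smaller is more compact)."""
    return sum(sum(m) for m in masks)


def masks_are_unique(masks: Sequence[Mask]) -> bool:
    """True if no two sub-rewards rely on exactly the same feature set."""
    return len(set(masks)) == len(masks)


state = (1.0, 5.0, -3.0, 0.5)
parts = decomposed_rewards(state)

# The sub-rewards should add up to the full reward on this state.
assert sum(parts) == full_reward(state)
```

In this sketch the masks are fixed by hand; a learned method would instead have to discover which features each sub-reward needs, which is the part the abstract attributes to joint learning.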



Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)