Would I have gotten that reward? Long-term credit assignment by counterfactual contribution analysis

Dec-26-2025, 22:19:03 GMT–Neural Information Processing Systems

To make reinforcement learning more sample efficient, we need better credit assignment methods that measure an action's influence on future rewards. Building upon Hindsight Credit Assignment (HCA), we introduce Counterfactual Contribution Analysis (COCOA), a new family of model-based credit assignment algorithms. Our algorithms achieve precise credit assignment by measuring the contribution of actions upon obtaining subsequent rewards, by quantifying a counterfactual query: 'Would the agent still have reached this reward if it had taken another action?'. We show that measuring contributions w.r.t.

assignment, credit assignment, long-term credit assignment, (7 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.42)