Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
Chi-Chang Lee
Neural Information Processing Systems
As such, we prevent policies from merely exploiting heuristic rewards without improving the task reward.
Nov-20-2025, 07:17:05 GMT
- Country:
- Asia
- Middle East > Jordan (0.04)
- Taiwan (0.04)
- North America > United States
- Massachusetts (0.04)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Industry:
- Government (0.92)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Reinforcement Learning (0.94)
- Representation & Reasoning
- Agents (1.00)
- Optimization (1.00)
- Robots (1.00)