CARL: Critical Action Focused Reinforcement Learning for Multi-Step Agent
Leyang Shen, Yang Zhang, Chun Kai Ling, Xiaoyan Zhao, Tat-Seng Chua
–arXiv.org Artificial Intelligence
Agents capable of accomplishing complex tasks through multiple interactions with the environment have emerged as a popular research direction. However, in such multi-step settings, the conventional group-level policy optimization algorithm becomes suboptimal because it implicitly assumes that every action contributes equally to the outcome, an assumption that deviates significantly from reality. Our analysis reveals that only a small fraction of actions are critical in determining the final outcome. Building on this insight, we propose CARL, a critical-action-focused reinforcement learning algorithm tailored for multi-step agents. CARL focuses training by providing action-level optimization signals for high-criticality actions while excluding low-criticality actions from the model update. Extensive experiments demonstrate that CARL achieves both stronger performance and higher efficiency during training and inference across diverse evaluation settings.
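The idea of updating only on high-criticality actions can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how criticality is estimated or how the action-level signal is computed, so the per-action `criticality` scores, the `threshold`, and the per-action `advantages` below are all hypothetical inputs, and the loss is a plain policy-gradient surrogate chosen for illustration.

```python
from typing import List, Sequence


def critical_mask(criticality: Sequence[float], threshold: float) -> List[bool]:
    """Mark actions whose (hypothetical) criticality score reaches the threshold."""
    return [c >= threshold for c in criticality]


def focused_policy_loss(
    log_probs: Sequence[float],
    advantages: Sequence[float],
    criticality: Sequence[float],
    threshold: float = 0.5,
) -> float:
    """Policy-gradient surrogate averaged over high-criticality actions only.

    Low-criticality actions are masked out, so they contribute nothing
    to the loss and therefore nothing to the model update.
    """
    if not (len(log_probs) == len(advantages) == len(criticality)):
        raise ValueError("all inputs must have one entry per action")

    mask = critical_mask(criticality, threshold)
    kept = [lp * adv for lp, adv, keep in zip(log_probs, advantages, mask) if keep]
    if not kept:
        # No critical action in this trajectory: nothing to optimize.
        return 0.0
    return -sum(kept) / len(kept)


# Three actions in one trajectory; only the first and third are critical
# under the assumed threshold of 0.5.
loss = focused_policy_loss(
    log_probs=[-1.0, -2.0, -0.5],
    advantages=[0.5, 1.0, -0.2],
    criticality=[0.9, 0.1, 0.7],
)
```

By contrast, a group-level baseline would assign every action in a trajectory the same advantage and average over all of them, which is the equal-contribution assumption the abstract argues against.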
Dec-5-2025