CHARMS: A Cognitive Hierarchical Agent for Reasoning and Motion Stylization in Autonomous Driving

Wang, Jingyi, Chu, Duanfeng, Deng, Zejian, Lu, Liping, Wang, Jinxiang, Sun, Chen

arXiv.org Artificial Intelligence 

To address the limitations of these approaches, we propose CHARMS, a decision-making model based on Level-k game theory [20]. The distinction between our approach and the existing methods is illustrated in Figure 1. CHARMS uses cognitive hierarchy theory to model the different reasoning depths of agents, and Social Value Orientation (SVO) to capture individual preferences in driving behavior. We train the model in two stages, deep reinforcement learning (DRL) pretraining followed by supervised fine-tuning (SFT), to obtain decision-making models that exhibit a wide range of human-like driving styles. Additionally, we integrate Poisson cognitive hierarchy (PCH) theory so that CHARMS can generate more complex simulation scenarios populated with vehicles of diverse styles. The main contributions of this paper can be summarized as follows: (1) a behavior model that integrates Level-k reasoning and SVO to simulate cognitively diverse driving styles; (2) a two-stage training scheme (DRL + SFT) that ensures both style distinctiveness and behavioral realism; (3) a scenario generation method based on PCH theory that controls the distribution of driving styles, with the aim of creating more realistic and behaviorally diverse simulation scenarios.
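To make the roles of PCH and SVO concrete, the following minimal Python sketch shows one way reasoning levels could be assigned to vehicles in a scenario and how an SVO angle could blend ego and other-agent rewards. It is an illustration under stated assumptions, not the CHARMS implementation: the function names, the truncation at `max_level`, the Poisson rate `tau`, and the cosine/sine reward weighting (a common SVO formulation in the literature) are all assumptions that do not come from the text above.

```python
import math
import random


def pch_level_probs(tau, max_level):
    """Poisson cognitive hierarchy frequencies for levels 0..max_level.

    Assumption: level k has weight exp(-tau) * tau**k / k!, truncated at
    max_level and renormalized so the probabilities sum to 1.
    """
    weights = [
        math.exp(-tau) * tau ** k / math.factorial(k)
        for k in range(max_level + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def sample_levels(n_vehicles, tau, max_level, seed=None):
    """Draw one reasoning level per vehicle from the truncated PCH distribution."""
    rng = random.Random(seed)
    probs = pch_level_probs(tau, max_level)
    return rng.choices(range(max_level + 1), weights=probs, k=n_vehicles)


def svo_reward(r_ego, r_other, svo_angle_rad):
    """Blend ego and other-agent rewards using an SVO angle.

    Assumption: the common form cos(phi) * r_ego + sin(phi) * r_other,
    where phi = 0 is purely egoistic and phi = pi/4 weights both equally.
    """
    return math.cos(svo_angle_rad) * r_ego + math.sin(svo_angle_rad) * r_other


if __name__ == "__main__":
    # Hypothetical scenario: 10 vehicles, mean reasoning depth around 1.5.
    levels = sample_levels(n_vehicles=10, tau=1.5, max_level=3, seed=0)
    print("sampled levels:", levels)
    print("prosocial reward:", svo_reward(1.0, 0.5, math.pi / 4))
```

In this sketch, raising `tau` shifts probability mass toward deeper reasoning levels, which is one way a scenario generator could control the mix of driving styles; the SVO angle would be set per vehicle, independently of its level.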