What Limits LLM-based Human Simulation: LLMs or Our Design?

Qian Wang, Jiaying Wu, Zhenheng Tang, Bingqiao Luo, Nuo Chen, Wei Chen, Bingsheng He

arXiv.org Artificial Intelligence 

Recent studies have revealed significant gaps between LLM-based human simulations and real-world observations, highlighting these dual challenges. To address these gaps, we present a comprehensive analysis of LLM limitations and our design issues, proposing targeted solutions for both aspects. Furthermore, we explore future directions that address both challenges simultaneously, particularly in data collection, LLM generation, and evaluation. To support further research in this field, we provide a curated collection of

Wang et al., 2024b; Wu et al., 2024; Zhang et al., 2024b), as shown in Figure 1. Initial successes in LLM-based human simulations have been demonstrated across diverse fields, including society, economics, policy, and psychology (Chen et al., 2024a; Li et al., 2024b;f; Lin et al., 2023; Park et al., 2023b; Yang et al., 2024b). Moreover, reliable LLM simulations can generate high-quality data for LLM training (Tang et al., 2024; Zhang et al., 2024a) and evaluate data quality (Chiang et al., 2024; Moniri et al., 2024; Xu et al., 2023b; Zheng et al., 2023b), serving as a data generator and evaluator (Gu et al., 2024; Li et al., 2024c; Son et al., 2024) to enhance LLM pre-training and simulation abilities.