Towards a Design Guideline for RPA Evaluation: A Survey of Large Language Model-Based Role-Playing Agents
Chaoran Chen, Bingsheng Yao, Ruishi Zou, Wenyue Hua, Weimin Lyu, Toby Jia-Jun Li, Dakuo Wang
arXiv.org Artificial Intelligence
Role-Playing Agents (RPAs) are an increasingly popular type of LLM agent that simulates human-like behaviors across a variety of tasks. However, evaluating RPAs is challenging because task requirements and agent designs vary widely. This paper proposes an evidence-based, actionable, and generalizable evaluation design guideline for LLM-based RPAs, derived from a systematic review of 1,676 papers published between Jan. 2021 and Dec. 2024. Our analysis identifies six agent attributes, seven task attributes, and seven evaluation metrics from the existing literature. Based on these findings, we present an RPA evaluation design guideline to help researchers develop more systematic and consistent evaluation methods.
Feb-18-2025
- Country:
- Asia
- North America
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- California
- San Diego County > San Diego (0.04)
- Santa Barbara County > Santa Barbara (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- New York
- New York County > New York City (0.04)
- Suffolk County > Stony Brook (0.04)
- Genre:
- Overview (1.00)
- Research Report > New Finding (0.46)
- Industry:
- Education (1.00)
- Government (0.93)
- Health & Medicine
- Consumer Health (0.67)
- Therapeutic Area > Psychiatry/Psychology
- Mental Health (0.67)
- Technology: