OPeRA: A Dataset of Observation, Persona, Rationale, and Action for Evaluating LLMs on Human Online Shopping Behavior Simulation

Wang, Ziyi; Lu, Yuxuan; Li, Wenbo; Amini, Amirali; Sun, Bo; Bart, Yakov; Lyu, Weimin; Gesi, Jiri; Wang, Tian; Huang, Jing; Su, Yu; Ehsan, Upol; Alikhani, Malihe; Li, Toby Jia-Jun; Chilton, Lydia; Wang, Dakuo

Jul-25-2025–arXiv.org Artificial Intelligence

Can large language models (LLMs) accurately simulate the next web action of a specific user? While LLMs have shown promising capabilities in generating "believable" human behaviors, evaluating their ability to mimic real user behaviors remains an open challenge, largely because no high-quality, publicly available dataset captures both the observable actions and the internal reasoning of an actual human user. To address this gap, we introduce OPeRA, a novel dataset of Observation, Persona, Rationale, and Action collected from real human participants during online shopping sessions. OPeRA is the first public dataset that comprehensively captures user personas, browser observations, fine-grained web actions, and self-reported just-in-time rationales. We developed both an online questionnaire and a custom browser plugin to gather this dataset with high fidelity. Using OPeRA, we establish the first benchmark to evaluate how well current LLMs can predict a specific user's next action and rationale, given that user's persona and history. This dataset lays the groundwork for future research into LLM agents that aim to act as personalized digital twins for humans.
