Retail
Can LLM Agents Simulate Multi-Turn Human Behavior? Evidence from Real Online Customer Behavior Data
Lu, Yuxuan, Huang, Jing, Han, Yan, Yao, Bingsheng, Bei, Sisong, Gesi, Jiri, Xie, Yaochen, Wang, Zheshen, He, Qi, Wang, Dakuo
Recent research shows that LLM agents can generate "believable" human behaviors via prompt-only methods, and such agents have been increasingly adopted in downstream applications. However, existing evaluation of these agents focuses only on qualitative believability (whether human raters think they are accurate), leaving open the question of whether LLM agents can accurately generate step-by-step actions mimicking a particular human's behavior in a multi-turn interaction task. In this work, we take shopping as a case study and present the first large-scale quantitative evaluation of state-of-the-art LLMs' ability to accurately simulate human behavior. Using real-world data from 31,865 online shopping sessions containing 230,965 user actions, our evaluation reveals that prompt-based LLMs (DeepSeek-R1, Llama, Claude) achieve only 11.86% accuracy in generating human actions, highlighting a substantial gap in actual behavioral accuracy. Through experiments, we also show that strategies as simple as fine-tuning LLMs on real human click-through data augmented with synthesized reasoning traces can greatly enhance models' performance. The fine-tuned Qwen2.5-7B achieves 17.26% action generation accuracy and 33.86% F1 score on final purchase prediction, representing substantial improvements of 5.4% and 13.85% over prompt-only baselines. This work establishes the first rigorous benchmark for human behavior simulation and provides actionable insights for developing more accurate LLM agents for future downstream applications.
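The abstract reports action generation accuracy and an F1 score but does not define them. As a rough illustration only, the sketch below assumes exact-match accuracy over per-step actions and the standard binary F1 over purchase/no-purchase predictions. The function names and the toy action labels are hypothetical and do not come from the paper. The last helper checks the reported accuracy gain: 17.26 minus 11.86 is 5.40, so the 5.4% figure matches an absolute difference in percentage points.

```python
def action_accuracy(predicted, actual):
    """Fraction of steps where the predicted action exactly matches the real one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def f1_score(predicted, actual):
    """Standard binary F1: harmonic mean of precision and recall."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def absolute_gain(new_pct, baseline_pct):
    """Difference between two percentages, in percentage points."""
    return round(new_pct - baseline_pct, 2)


# Toy session with hypothetical action labels (not from the paper's dataset).
toy_accuracy = action_accuracy(["click", "search", "buy"], ["click", "scroll", "buy"])
# One true positive, one false positive, one false negative.
toy_f1 = f1_score([True, True, False, False], [True, False, True, False])
# Accuracy figures quoted in the abstract.
accuracy_gain = absolute_gain(17.26, 11.86)
```

Under these assumed definitions, a simulated session counts a step as correct only when the generated action equals the logged one, which is a stricter standard than a human rater judging the behavior believable.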
Parameter-Free Federated TD Learning with Markov Noise in Heterogeneous Environments
Naskar, Ankur, Thoppe, Gugan, Negi, Utsav, Gupta, Vijay
Federated learning (FL) can dramatically speed up reinforcement learning by distributing exploration and training across multiple agents. It can guarantee an optimal convergence rate that scales linearly in the number of agents, i.e., a rate of $\tilde{O}(1/(NT))$, where $T$ is the iteration index and $N$ is the number of agents. However, when the training samples arise from a Markov chain, existing results on TD learning achieving this rate require the algorithm to depend on unknown problem parameters. We close this gap by proposing a two-timescale Federated Temporal Difference (FTD) learning with Polyak-Ruppert averaging. Our method provably attains the optimal $\tilde{O}(1/(NT))$ rate in both average-reward and discounted settings, offering a parameter-free FTD approach for Markovian data. Although our results are novel even in the single-agent setting, they apply to the more realistic and challenging scenario of FL with heterogeneous environments.
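To make the ingredients named in the abstract concrete, here is a minimal sketch of federated TD(0) with iterate averaging in the discounted setting. It is not the paper's two-timescale algorithm. It assumes a tabular two-state Markov reward process, identical (homogeneous) environments for all agents, server averaging at every iteration, a single step size of $t^{-0.6}$ chosen arbitrarily so that it uses no problem parameters, and averaging over the second half of the iterates as a simple variant of Polyak-Ruppert averaging. The transition matrix, rewards, and function names are hypothetical.

```python
import random

# Hypothetical two-state Markov reward process shared by all agents.
P = [[0.9, 0.1], [0.2, 0.8]]  # P[s][s_next]
R = [1.0, 0.0]                # reward received in state s
GAMMA = 0.9                   # discount factor


def true_values():
    """Solve (I - GAMMA * P) v = R in closed form for the 2x2 case."""
    a, b = 1 - GAMMA * P[0][0], -GAMMA * P[0][1]
    c, d = -GAMMA * P[1][0], 1 - GAMMA * P[1][1]
    det = a * d - b * c
    return [(d * R[0] - b * R[1]) / det, (a * R[1] - c * R[0]) / det]


def federated_td(num_agents=8, num_iters=20000, seed=0):
    """Federated tabular TD(0) with server averaging and tail-averaged iterates."""
    rng = random.Random(seed)
    v = [0.0, 0.0]               # shared value estimate held by the server
    states = [0] * num_agents    # each agent follows its own Markov trajectory
    tail_start = num_iters // 2
    tail_sum = [0.0, 0.0]
    for t in range(1, num_iters + 1):
        alpha = t ** -0.6        # step size independent of problem parameters
        update = [0.0, 0.0]
        for i in range(num_agents):
            s = states[i]
            s_next = 0 if rng.random() < P[s][0] else 1
            td_error = R[s] + GAMMA * v[s_next] - v[s]
            update[s] += td_error / num_agents  # server averages agent updates
            states[i] = s_next
        v = [v[k] + alpha * update[k] for k in range(2)]
        if t > tail_start:       # average only the later iterates
            tail_sum = [tail_sum[k] + v[k] for k in range(2)]
    n_tail = num_iters - tail_start
    return [x / n_tail for x in tail_sum]


estimate = federated_td()
exact = true_values()
```

Comparing `estimate` with `exact` gives a rough sense of how averaging across agents and across iterates reduces noise from Markovian sampling. The sketch makes no claim about the convergence rate stated in the abstract.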
We Found 136 of the Best Prime Day Deals Still on for 2025: Up to 55% Off
Amazon's fall Prime Day sale has come and gone, but a few of the best deals are still available. If you are a Prime member who missed out, there's some good news: some leftover deals are still going strong. We're keeping you updated here with the best markdowns on our favorite tech gear and gadgets that are still available, from Alexa-enabled speakers to robot vacs to laptops and tablets. The WIRED Reviews team tests products year-round, and at sales events like this, we only recommend deals on stuff we have actually used and approved. We sorted through thousands of deals by hand to make these picks. The Fire HD 10 is Amazon's best tablet for most people. The current model dates from 2023, but the octa-core processor is plenty fast for consuming Amazon Prime content, which is really the primary reason to buy a Fire tablet. The full HD (1080p) screen won't win any awards, but it's good enough for streaming movies. Fire tablets can do double duty as an Echo speaker, too. Turn on Show Mode (swipe down on the notification overlay and check the Show Mode box) and you can query Alexa to your heart's content.