Reinforcement Learning to Optimize Lifetime Value in Cold-Start Recommendation
Ji, Luo, Qi, Qin, Han, Bingqing, Yang, Hongxia
–arXiv.org Artificial Intelligence
Recommender system plays a crucial role in modern E-commerce platform. Due to the lack of historical interactions between users and items, cold-start recommendation is a challenging problem. In order to alleviate the cold-start issue, most existing methods introduce content and contextual information as the auxiliary information. Nevertheless, these methods assume the recommended items behave steadily over time, while in a typical E-commerce scenario, items generally have very different performances throughout their life period. In such a situation, it would be beneficial to consider the long-term return from the item perspective, which is usually ignored in conventional methods. Reinforcement learning (RL) naturally fits such a long-term optimization problem, in which the recommender could identify high potential items, proactively allocate more user impressions to boost their growth, therefore improve the multi-period cumulative gains. Inspired by this idea, we model the process as a Partially Observable and Controllable Markov Decision Process (POC-MDP), and propose an actor-critic RL framework (RL-LTV) to incorporate the item lifetime values (LTV) into the recommendation. In RL-LTV, the critic studies historical trajectories of items and predict the future LTV of fresh item, while the actor suggests a score-based policy which maximizes the future LTV expectation. Scores suggested by the actor are then combined with classical ranking scores in a dual-rank framework, therefore the recommendation is balanced with the LTV consideration. Our method outperforms the strong live baseline with a relative improvement of 8.67% and 18.03% on IPV and GMV of cold-start items, on one of the largest E-commerce platform.
arXiv.org Artificial Intelligence
Aug-20-2021
- Country:
- Oceania > Australia (0.05)
- North America > United States
- New York > New York County > New York City (0.04)
- Asia
- Middle East > Jordan (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- China
- Zhejiang Province > Hangzhou (0.04)
- Beijing > Beijing (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Services > e-Commerce Services (0.75)