Generator and Critic: A Deep Reinforcement Learning Approach for Slate Re-ranking in E-commerce
Wei, Jianxiong; Zeng, Anxiang; Wu, Yueqiu; Guo, Peng; Hua, Qingsong; Cai, Qingpeng
The slate re-ranking problem considers the mutual influences between items to improve user satisfaction in e-commerce, in contrast to point-wise ranking. Previous works either rank items directly with an end-to-end model, or rank items by a score function that trades off the point-wise score against the diversity between items. However, two main challenges are not well studied: (1) evaluating a slate is hard because of the complex mutual influences between the items of one slate; (2) even given the optimal evaluation, searching for the optimal slate is challenging because the action space is exponentially large. In this paper, we present a novel Generator and Critic slate re-ranking approach, where the Critic evaluates the slate and the Generator ranks the items by reinforcement learning. We propose a Full Slate Critic (FSC) model that considers the real impressed items and avoids the "impressed bias" of existing models. For the Generator, to tackle the problem of the large action space, we propose a new exploration reinforcement learning algorithm, called PPO-Exploration. Experimental results show that the FSC model significantly outperforms the state-of-the-art slate evaluation methods, and that the PPO-Exploration algorithm substantially outperforms the existing reinforcement learning methods. The Generator and Critic approach improves both the slate efficiency (4% GMV and 5% number of orders) and diversity in live experiments on one of the largest e-commerce websites.
Figure 1: The return list when searching "smart watch"
To improve the diversity of the list, a series of research works, MMR, IA-Select, xQuAD and DUM [1, 4, 9, 24], have been proposed to rank items by weighted functions that trade off the user-item scores and the diversities of items. However, these methods ignore the impact of diversity on the efficiency of the list.
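The score-function style of re-ranking described above, in which a point-wise score is traded off against diversity between items, can be illustrated with a small greedy selection in the spirit of MMR. This is only a sketch under stated assumptions, not the method of this paper or of any cited work: the function name `greedy_rerank`, the `trade_off` weight, and the use of a shared category as a stand-in for item similarity are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def greedy_rerank(
    scores: Dict[str, float],
    categories: Dict[str, str],
    slate_size: int,
    trade_off: float = 0.7,
) -> List[str]:
    """Greedy MMR-style re-ranking (illustrative only).

    At each step, pick the remaining item that maximises
        trade_off * point_wise_score - (1 - trade_off) * redundancy
    where redundancy is 1.0 if the item's category already appears in the
    slate and 0.0 otherwise. The category-based redundancy and the
    `trade_off` parameter are assumptions, not taken from the paper.
    """
    slate: List[str] = []
    remaining = set(scores)
    while remaining and len(slate) < slate_size:
        chosen_categories = {categories[item] for item in slate}

        def marginal(item: str) -> float:
            redundancy = 1.0 if categories[item] in chosen_categories else 0.0
            return trade_off * scores[item] - (1.0 - trade_off) * redundancy

        # Iterating in sorted order makes ties resolve deterministically.
        best = max(sorted(remaining), key=marginal)
        slate.append(best)
        remaining.remove(best)
    return slate


if __name__ == "__main__":
    # Hypothetical candidates: two watches with high scores and one strap.
    example_scores = {"watch_a": 0.9, "watch_b": 0.85, "strap_c": 0.5}
    example_categories = {"watch_a": "watch", "watch_b": "watch", "strap_c": "strap"}
    print(greedy_rerank(example_scores, example_categories, slate_size=2, trade_off=0.5))
```

With `trade_off=1.0` the function reduces to plain point-wise ranking. Lower values penalise repeated categories more heavily. A hand-tuned weight of this kind fixes the diversity penalty in advance rather than evaluating its effect on the slate as a whole, which is the limitation the text attributes to these methods.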
May-25-2020
- Genre:
- Research Report > New Finding (0.34)
- Industry:
- Information Technology > Services > e-Commerce Services (1.00)
- Technology: