Dynamic Parameterized Network for CTR Prediction
Zhu, Jian, Liu, Congcong, Wang, Pei, Zhao, Xiwei, Chen, Guangpeng, Jin, Junsheng, Peng, Changping, Lin, Zhangang, Shao, Jingping
–arXiv.org Artificial Intelligence
Learning to capture feature relations effectively and efficiently is essential in click-through rate (CTR) prediction for modern recommendation systems. Most existing CTR prediction methods model such relations either through tedious manually-designed low-order interactions or through inflexible and inefficient high-order interactions, both of which require extra DNN modules for implicit interaction modeling. In this paper, we propose a novel plug-in operation, Dynamic Parameterized Operation (DPO), to learn both explicit and implicit interactions instance-wisely. We show that introducing DPO into DNN modules and Attention modules respectively benefits the two main tasks of CTR prediction: it enhances the adaptiveness of feature-based modeling and improves user behavior modeling with instance-wise locality. Our Dynamic Parameterized Network significantly outperforms state-of-the-art methods in offline experiments on a public dataset and a real-world production dataset, as well as in an online A/B test. Furthermore, the proposed Dynamic Parameterized Network has been deployed in the ranking system of one of the world's largest e-commerce companies, serving the main traffic of hundreds of millions of active users.

Click-through rate (CTR) prediction, which aims to estimate the probability of a user clicking an item, is of great importance in recommendation systems and online advertising systems (Cheng et al., 2016; Guo et al., 2017; Rendle, 2010; Zhou et al., 2018b). Effective feature modeling and user behavior modeling are two critical parts of CTR prediction. Deep neural networks (DNNs) have achieved tremendous success in a variety of CTR prediction methods for feature modeling (Cheng et al., 2016; Guo et al., 2017; Wang et al., 2017).
Under the hood, the core component of a DNN is a linear transformation followed by a nonlinear function, which models weighted interactions between the flattened inputs and contexts through fixed kernels, regardless of the intrinsic decoupling relations of specific contexts (Rendle et al., 2020). This property lets DNNs learn interactions in an implicit manner, while limiting their ability to model explicit relations, which are typically captured by a feature-crossing component (Rendle, 2010; Song et al., 2019). Most existing solutions exploit a combinatorial framework (a feature-crossing component plus a DNN component) to leverage both implicit and explicit feature interactions, which is suboptimal and inefficient (Cheng et al., 2016; Wang et al., 2017). For instance, Wide & Deep combines a linear module in the wide part for explicit low-order interactions with a DNN module that learns high-order feature interactions. Follow-up works such as Deep & Cross Network (DCN) follow a similar design, replacing the wide part with more sophisticated networks; however, this places restrictions on the input size, which is inflexible.
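The contrast between fixed kernels and instance-wise parameterization can be sketched in a few lines. The snippet below is a minimal NumPy illustration, not the paper's actual DPO formulation: `static_layer` is the standard DNN building block described above (one kernel shared by every instance), while `dynamic_layer` is a hypothetical instance-wise variant in which a small generator produces a distinct kernel per input, the general idea behind dynamic parameterization. All function and variable names here are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def static_layer(x, W, b):
    # Standard DNN block: the fixed kernel W is shared by every
    # input instance, so interactions are modeled only implicitly.
    return np.maximum(x @ W + b, 0.0)  # linear transform + ReLU

def dynamic_layer(x, W_gen, b_gen):
    # Hypothetical instance-wise variant: a generator maps each input
    # to its own kernel, so the transformation adapts per instance.
    # Shapes: x is (batch, d_in); W_gen is (d_in, d_in * d_out).
    batch, d_in = x.shape
    d_out = W_gen.shape[1] // d_in
    W_inst = (x @ W_gen).reshape(batch, d_in, d_out)  # one kernel per row
    out = np.einsum("bi,bio->bo", x, W_inst) + b_gen  # batched matvec
    return np.maximum(out, 0.0)

x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 16)); b = np.zeros(16)
W_gen = rng.normal(size=(8, 8 * 16)) * 0.1; b_gen = np.zeros(16)

y_static = static_layer(x, W, b)        # same kernel for all 4 instances
y_dyn = dynamic_layer(x, W_gen, b_gen)  # kernel depends on each instance
```

Note that because `W_inst` is itself a linear function of `x`, the instance-wise path computes a second-order (explicit) interaction in `x`, while the static path remains purely implicit; this is one way to read the explicit-plus-implicit claim made for DPO above.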
Nov-9-2021