dual learning algorithm
Contextual Dual Learning Algorithm with Listwise Distillation for Unbiased Learning to Rank
Yu, Lulu, Bi, Keping, Ni, Shiyu, Guo, Jiafeng
Unbiased Learning to Rank (ULTR) aims to leverage biased implicit user feedback (e.g., clicks) to optimize an unbiased ranking model. The effectiveness of existing ULTR methods has primarily been validated on synthetic datasets, and their performance on real-world click data remains unclear. Recently, Baidu released a large publicly available dataset of their web search logs. Subsequently, the NTCIR-17 ULTRE-2 task released a subset extracted from it. We evaluate commonly used or effective ULTR methods on this subset to determine whether they maintain their effectiveness. In this paper, we propose a Contextual Dual Learning Algorithm with Listwise Distillation (CDLA-LD) to simultaneously address both position bias and contextual bias. We utilize a listwise-input ranking model to obtain reconstructed feature vectors incorporating local contextual information and employ the Dual Learning Algorithm (DLA) method to jointly train this ranking model and a propensity model to address position bias. Because this ranking model learns the interaction information within the document lists of the training set, we additionally train a pointwise-input ranking model to enhance generalization: it learns, in a listwise manner, the listwise-input ranking model's capability for relevance judgment. Extensive experiments and analysis confirm the effectiveness of our approach.
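The joint training of a ranking model and a propensity model mentioned in the abstract can be illustrated with a minimal sketch of the DLA objective as it is commonly formulated: each model's softmax loss over clicked positions is reweighted by the inverse of the other model's normalized output. This is an assumption-laden illustration, not the CDLA-LD implementation. The function name `dla_losses` and its arguments are hypothetical, and the listwise-input model and the distillation step are not shown.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def dla_losses(rank_scores, prop_logits, clicks):
    """Return (ranking_loss, propensity_loss) for one result list.

    rank_scores: ranking-model scores, one per displayed document
    prop_logits: propensity-model logits, one per display position
    clicks:      binary click indicators, aligned with positions
    (All names are hypothetical and used only for illustration.)
    """
    rel = softmax(rank_scores)    # relevance distribution over the list
    prop = softmax(prop_logits)   # examination distribution over positions

    # Inverse weights, normalized by the first position. In a gradient-based
    # implementation these would be treated as constants (no gradient).
    ipw = prop[0] / prop   # inverse propensity weights for the ranker
    irw = rel[0] / rel     # inverse relevance weights for the propensity model

    ranking_loss = -np.sum(ipw * clicks * np.log(rel))
    propensity_loss = -np.sum(irw * clicks * np.log(prop))
    return ranking_loss, propensity_loss


# A click at a rarely examined position is weighted more heavily.
r_loss, p_loss = dla_losses(
    rank_scores=np.zeros(3),
    prop_logits=np.array([2.0, 1.0, 0.0]),
    clicks=np.array([0, 0, 1]),
)
```

In this sketch the two losses would be minimized alternately or jointly, so that the propensity estimate corrects the ranker for position bias while the relevance estimate corrects the propensity model.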
Dual Learning Algorithm for Delayed Feedback in Display Advertising
Saito, Yuta, Morishita, Gota, Yasui, Shota
In display advertising, predicting the conversion rate, that is, the probability that a user takes a predefined action on an advertiser's website, is fundamental in estimating the value of showing a user an advertisement. Delayed feedback causes two difficulties in conversion rate prediction. First, some positive labels are not correctly observed in training data, because some conversions do not occur right after clicking the ads. Second, the delay mechanism is not uniform among instances; some positive feedback is much more frequently observed than the rest. It is widely acknowledged that these problems cause a severe bias in the naive empirical average loss function for conversion rate prediction. To overcome these challenges, we propose two unbiased estimators, one for the conversion rate prediction and the other for the bias estimation. Subsequently, we propose an interactive learning algorithm named Dual Learning Algorithm for Delayed Feedback (DLA-DF), where a conversion rate predictor and a bias estimator are learned alternately. The proposed algorithm is the first of its kind to address the two major challenges in a theoretically principled way. Lastly, we conducted a simulation experiment to demonstrate that the proposed method outperforms the existing baselines and to validate that the unbiased estimation approach is suitable for the delayed feedback problem.