baidu-ultr
- North America > United States (0.05)
- Europe > Spain > Galicia > Madrid (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- (3 more...)
Understanding the Effects of the Baidu-ULTR Logging Policy on Two-Tower Models
de Haan, Morris, Hager, Philipp
Despite the popularity of the two-tower model for unbiased learning to rank (ULTR) tasks, recent work suggests that it suffers from a major limitation that could lead to its collapse in industry applications: the problem of logging policy confounding. Several potential solutions have even been proposed; however, the evaluation of these methods was mostly conducted using semi-synthetic simulation experiments. This paper bridges the gap between theory and practice by investigating the confounding problem on the largest real-world dataset, Baidu-ULTR. Our main contributions are threefold: 1) we show that the conditions for the confounding problem are given on Baidu-ULTR, 2) the confounding problem bears no significant effect on the two-tower model, and 3) we point to a potential mismatch between expert annotations, the golden standard in ULTR, and user click behavior.
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- Europe > Italy > Apulia > Bari (0.05)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.05)
- (4 more...)
A Large Scale Search Dataset for Unbiased Learning to Rank
Zou, Lixin, Mao, Haitao, Chu, Xiaokai, Tang, Jiliang, Ye, Wenwen, Wang, Shuaiqiang, Yin, Dawei
The unbiased learning to rank (ULTR) problem has been greatly advanced by recent deep learning techniques and well-designed debias algorithms. However, promising results on the existing benchmark datasets may not be extended to the practical scenario due to the following disadvantages observed from those popular benchmark datasets: (1) outdated semantic feature extraction where state-of-the-art large scale pre-trained language models like BERT cannot be exploited due to the missing of the original text;(2) incomplete display features for in-depth study of ULTR, e.g., missing the displayed abstract of documents for analyzing the click necessary bias; (3) lacking real-world user feedback, leading to the prevalence of synthetic datasets in the empirical study. To overcome the above disadvantages, we introduce the Baidu-ULTR dataset. It involves randomly sampled 1.2 billion searching sessions and 7,008 expert annotated queries, which is orders of magnitude larger than the existing ones. Baidu-ULTR provides:(1) the original semantic feature and a pre-trained language model for easy usage; (2) sufficient display information such as position, displayed height, and displayed abstract, enabling the comprehensive study of different biases with advanced techniques such as causal discovery and meta-learning; and (3) rich user feedback on search result pages (SERPs) like dwelling time, allowing for user engagement optimization and promoting the exploration of multi-task learning in ULTR. In this paper, we present the design principle of Baidu-ULTR and the performance of benchmark ULTR algorithms on this new data resource, favoring the exploration of ranking for long-tail queries and pre-training tasks for ranking. The Baidu-ULTR dataset and corresponding baseline implementation are available at https://github.com/ChuXiaokai/baidu_ultr_dataset.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- (3 more...)