Counterfactual Evaluation of Ads Ranking Models through Domain Adaptation

Radwan, Mohamed A.; Bhattacharjee, Himaghna; Lanners, Quinn; Zhang, Jiasheng; Karakulak, Serkan; Nassif, Houssam; Bayir, Murat Ali

arXiv.org Artificial Intelligence

We propose a domain-adapted reward model that works alongside an offline A/B testing system to evaluate ranking models. The approach measures the reward of ranking-model changes in large-scale Ads recommender systems, where model-free methods such as inverse propensity scoring (IPS) are not feasible. In our experiments, the proposed technique outperforms both the vanilla IPS method and approaches that use non-generalized reward models.
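
To make the contrast between the two families of estimators concrete, the following is a minimal sketch of a model-free IPS estimate next to a reward-model-based ("direct method") estimate. It is not the method proposed in this paper: the domain-adaptation step is not described in the abstract and is not shown here. All names (`ips_estimate`, `direct_estimate`, `toy_policy`, `toy_reward_model`) and all numbers are hypothetical and chosen only for illustration.

```python
def ips_estimate(rewards, logging_probs, target_probs):
    """Model-free IPS estimate of the target policy's average reward.

    Each logged reward is reweighted by the ratio of the probability the
    target policy assigns to the logged action over the probability the
    logging policy assigned to it.
    """
    if not (len(rewards) == len(logging_probs) == len(target_probs)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for r, p_log, p_tgt in zip(rewards, logging_probs, target_probs):
        if p_log <= 0.0:
            # IPS is undefined when the logging policy could never have
            # taken the logged action (for example, a deterministic ranker).
            raise ValueError("logging probability must be positive")
        total += (p_tgt / p_log) * r
    return total / len(rewards)


def direct_estimate(contexts, actions, target_policy, reward_model):
    """Reward-model-based estimate of the target policy's average reward.

    For each logged context, average the model-predicted reward over the
    actions, weighted by the target policy's action probabilities. No
    logging propensities are needed, but the result is only as good as
    the reward model's generalization to the target policy's actions.
    """
    total = 0.0
    for x in contexts:
        total += sum(target_policy(x, a) * reward_model(x, a) for a in actions)
    return total / len(contexts)


# Hypothetical toy data: four logged impressions.
rewards = [1.0, 0.0, 1.0, 0.0]
logging_probs = [0.5, 0.5, 0.25, 0.25]
target_probs = [0.25, 0.25, 0.5, 0.5]


def toy_policy(context, action):
    # Hypothetical target policy: uniform over two candidate ads.
    return 0.5


def toy_reward_model(context, action):
    # Hypothetical reward model: fixed predicted click rate per ad.
    return 0.2 if action == "ad_a" else 0.4


if __name__ == "__main__":
    print(ips_estimate(rewards, logging_probs, target_probs))  # 0.625
    print(direct_estimate([0, 1], ["ad_a", "ad_b"], toy_policy, toy_reward_model))
```

In this sketch the IPS weight is `target_prob / logging_prob`, so a logged action with a very small logging probability dominates the average, and a logging probability of zero leaves the estimate undefined. That is one commonly cited reason IPS becomes impractical at scale; the reward-model route avoids propensities but shifts the burden onto how well the model generalizes.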