Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning
Tao Liu, Qi Xu, Wei Shi, Zhigang Hua, Shuang Yang
arXiv.org Artificial Intelligence
Session-level dynamic ad load optimization aims to personalize the density and types of delivered advertisements in real time during a user's online session by dynamically balancing user experience quality and ad monetization. Traditional causal learning-based approaches struggle with key technical challenges, especially in handling confounding bias and distribution shifts. In this paper, we develop an offline deep Q-network (DQN)-based framework that effectively mitigates confounding bias in dynamic systems and demonstrates more than 80% offline gains compared to the best causal learning-based production baseline. Moreover, to improve robustness against unanticipated distribution shifts, we further enhance the framework with a novel offline robust dueling DQN approach. This approach achieves more stable rewards on multiple OpenAI Gym datasets as perturbations increase, and provides an additional 5% offline gain on real-world ad delivery data. Deployed across multiple production systems, our approach has achieved outsized topline gains. Post-launch online A/B tests have shown double-digit improvements in engagement-ad score trade-off efficiency, significantly enhancing our platform's capability to serve both consumers and advertisers.
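For context, the dueling DQN architecture referenced in the abstract decomposes the action-value function as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a), separating a state-value stream from an action-advantage stream. The sketch below illustrates that standard decomposition in PyTorch; the dimensions (state_dim, num_actions, hidden_dim) and the ad-load action framing are illustrative assumptions, and the code omits the offline training and robustness components the paper describes.

```python
import torch
import torch.nn as nn


class DuelingQNetwork(nn.Module):
    """Standard dueling Q-network: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_dim: int, num_actions: int, hidden_dim: int = 128):
        super().__init__()
        # Shared encoder over the session state features.
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.value_head = nn.Linear(hidden_dim, 1)                 # V(s)
        self.advantage_head = nn.Linear(hidden_dim, num_actions)   # A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.encoder(state)
        value = self.value_head(h)            # shape: (batch, 1)
        advantage = self.advantage_head(h)    # shape: (batch, num_actions)
        # Subtract the mean advantage so the V/A split is identifiable.
        return value + advantage - advantage.mean(dim=1, keepdim=True)


if __name__ == "__main__":
    # Hypothetical usage: score candidate ad-load actions (e.g., ad density levels)
    # for a batch of session states and pick the greedy action per session.
    net = DuelingQNetwork(state_dim=32, num_actions=5)
    q_values = net(torch.randn(4, 32))
    greedy_actions = q_values.argmax(dim=1)
    print(q_values.shape, greedy_actions)
```

The mean-subtraction in the forward pass is the usual identifiability trick for dueling networks; in an offline or robust variant, this network would typically be trained on logged session data with additional regularization against distribution shift rather than with online interaction.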
Jan-9-2025
- Country:
- North America > United States (0.68)
- Genre:
- Research Report > Experimental Study (0.68)
- Industry:
- Marketing (0.66)
- Technology: