Goto

Collaborating Authors

 ctr






CTR Prediction on Alibaba's Taobao Advertising Dataset Using Traditional and Deep Learning Models

Yang, Hongyu, Wen, Chunxi, Zhang, Jiyin, Shen, Nanfei, Zhang, Shijiao, Han, Xiyan

arXiv.org Artificial Intelligence

Click-through rates prediction is critical in modern advertising systems, where ranking relevance and user engagement directly impact platform efficiency and business value. In this project, we explore how to model CTR more effectively using a large-scale Taobao dataset released by Alibaba. We start with supervised learning models, including logistic regression and Light-GBM, that are trained on static features such as user demographics, ad attributes, and contextual metadata. These models provide fast, interpretable benchmarks, but have limited capabilities to capture patterns of behavior that drive clicks. To better model user intent, we combined behavioral data from hundreds of millions of interactions over a 22-day period. By extracting and encoding user action sequences, we construct representations of user interests over time. We use deep learning models to fuse behavioral embeddings with static features. Among them, multilayer perceptrons (MLPs) have achieved significant performance improvements. To capture temporal dynamics, we designed a Transformer-based architecture that uses a self-attention mechanism to learn contextual dependencies across behavioral sequences, modeling not only what the user interacts with, but also the timing and frequency of interactions. Transformer improves AUC by 2.81 % over the baseline (LR model), with the largest gains observed for users whose interests are diverse or change over time. In addition to modeling, we propose an A/B testing strategy for real-world evaluation. We also think about the broader implications: personalized ad targeting technology can be applied to public health scenarios to achieve precise delivery of health information or behavior guidance. Our research provides a roadmap for advancing click-through rate predictions and extending their value beyond e-commerce.


Local Collaborative Filtering: A Collaborative Filtering Method that Utilizes Local Similarities among Users

Shen, Zhaoxin, Wu, Dan

arXiv.org Artificial Intelligence

To leverage user behavior data from the Internet more effectively in recommender systems, this paper proposes a novel collaborative filtering (CF) method called Local Collaborative Filtering (LCF). LCF utilizes local similarities among users and integrates their data using the law of large numbers (LLN), thereby improving the utilization of user behavior data. Experiments are conducted on the Steam game dataset, and the results of LCF align with real-world needs.





Beyond Quality: Unlocking Diversity in Ad Headline Generation with Large Language Models

Wang, Chang, Yan, Siyu, Yuan, Depeng, Chen, Yuqi, Huang, Yanhua, Zheng, Yuanhang, Li, Shuhao, Zhang, Yinqi, Chen, Kedi, Zhu, Mingrui, Xu, Ruiwen

arXiv.org Artificial Intelligence

The generation of ad headlines plays a vital role in modern advertising, where both quality and diversity are essential to engage a broad range of audience segments. Current approaches primarily optimize language models for headline quality or click-through rates (CTR), often overlooking the need for diversity and resulting in homogeneous outputs. To address this limitation, we propose DIVER, a novel framework based on large language models (LLMs) that are jointly optimized for both diversity and quality. We first design a semantic- and stylistic-aware data generation pipeline that automatically produces high-quality training pairs with ad content and multiple diverse headlines. To achieve the goal of generating high-quality and diversified ad headlines within a single forward pass, we propose a multi-stage multi-objective optimization framework with supervised fine-tuning (SFT) and reinforcement learning (RL). Experiments on real-world industrial datasets demonstrate that DIVER effectively balances quality and diversity. Deployed on a large-scale content-sharing platform serving hundreds of millions of users, our framework improves advertiser value (ADVV) and CTR by 4.0% and 1.4%.