ctr model
From Scaling to Structured Expressivity: Rethinking Transformers for CTR Prediction
Yan, Bencheng, Lei, Yuejie, Zeng, Zhiyuan, Wang, Di, Lin, Kaiyi, Wang, Pengjie, Xu, Jian, Zheng, Bo
Despite massive investments in scale, deep models for click-through rate (CTR) prediction often exhibit rapidly diminishing returns - a stark contrast to the smooth, predictable gains seen in large language models. We identify the root cause as a structural misalignment: Transformers assume sequential compositionality, while CTR data demand combinatorial reasoning over high-cardinality semantic fields. Unstructured attention spreads capacity indiscriminately, amplifying noise under extreme sparsity and breaking scalable learning. To restore alignment, we introduce the Field-Aware Transformer (FAT), which embeds field-based interaction priors into attention through decomposed content alignment and cross-field modulation. This design ensures model complexity scales with the number of fields F, not the total vocabulary size n >> F, leading to tighter generalization and, critically, observed power-law scaling in AUC as model width increases. We present the first formal scaling law for CTR models, grounded in Rademacher complexity, that explains and predicts this behavior. On large-scale benchmarks, FAT improves AUC by up to +0.51% over state-of-the-art methods. Deployed online, it delivers +2.33% CTR and +0.66% RPM. Our work establishes that effective scaling in recommendation arises not from size, but from structured expressivity-architectural coherence with data semantics.
asLLR: LLM based Leads Ranking in Auto Sales
Sun, Yin, Liu, Yiwen, Song, Junjie, Zhang, Chenyu, Zhang, Xinyuan, Liu, Lingjie, Chen, Siqi, Cao, Yuji
In the area of commercial auto sales system, high-quality lead score sequencing determines the priority of a sale's work and is essential for optimizing the efficiency of the sales system. Since CRM (Customer Relationship Management) system contains plenty of textual interaction features between sales and customers, traditional techniques such as Click Through Rate (CTR) prediction struggle with processing the complex information inherent in natural language features, which limits their effectiveness in sales lead ranking. Bridging this gap is critical for enhancing business intelligence and decision-making. Recently, the emergence of large language models (LLMs) has opened new avenues for improving recommendation systems, this study introduces asLLR (LLM-based Leads Ranking in Auto Sales), which integrates CTR loss and Question Answering (QA) loss within a decoder-only large language model architecture. This integration enables the simultaneous modeling of both tabular and natural language features. To verify the efficacy of asLLR, we constructed an innovative dataset derived from the customer lead pool of a prominent new energy vehicle brand, with 300,000 training samples and 40,000 testing samples. Our experimental results demonstrate that asLLR effectively models intricate patterns in commercial datasets, achieving the AUC of 0.8127, surpassing traditional CTR estimation methods by 0.0231. Moreover, asLLR enhances CTR models when used for extracting text features by 0.0058. In real-world sales scenarios, after rigorous online A/B testing, asLLR increased the sales volume by about 9.5% compared to the traditional method, providing a valuable tool for business intelligence and operational decision-making.
- Automobiles & Trucks (1.00)
- Transportation > Ground > Road (0.66)
- Transportation > Electric Vehicle (0.48)
APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction
In many web applications, deep learning-based CTR prediction models (deep CTR models for short) are widely adopted. Traditional deep CTR models learn patterns in a static manner, i.e., the network parameters are the same across all the instances. However, such a manner can hardly characterize each of the instances which may have different underlying distributions. It actually limits the representation power of deep CTR models, leading to sub-optimal results. In this paper, we propose an efficient, effective, and universal module, named as Adaptive Parameter Generation network (APG), which can dynamically generate parameters for deep CTR models on-the-fly based on different instances. Extensive experimental evaluation results show that APG can be applied to a variety of deep CTR models and significantly improve their performance. Meanwhile, APG can reduce the time cost by 38.7% and memory usage by 96.6% compared to a regular deep CTR model. We have deployed APG in the industrial sponsored search system and achieved 3% CTR gain and 1% RPM gain respectively.
Clicks Versus Conversion: Choosing a Recommender's Training Objective in E-Commerce
Weiss, Michael, Rosenbach, Robert, Eggenberger, Christian
Ranking product recommendations to optimize for a high click-through rate (CTR) or for high conversion, such as add-to-cart rate (ACR) and Order-Submit-Rate (OSR, view-to-purchase conversion) are standard practices in e-commerce. Optimizing for CTR appears like a straightforward choice: Training data (i.e., click data) are simple to collect and often available in large quantities. Additionally, CTR is used far beyond e-commerce, making it a generalist, easily implemented option. ACR and OSR, on the other hand, are more directly linked to a shop's business goals, such as the Gross Merchandise Value (GMV). In this paper, we compare the effects of using either of these objectives using an online A/B test. Among our key findings, we demonstrate that in our shops, optimizing for OSR produces a GMV uplift more than five times larger than when optimizing for CTR, without sacrificing new product discovery. Our results also provide insights into the different feature importances for each of the objectives.
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > e-Commerce (0.96)
- Information Technology > Data Science (0.94)
Generative Click-through Rate Prediction with Applications to Search Advertising
Kong, Lingwei, Wang, Lu, Peng, Changping, Lin, Zhangang, Law, Ching, Shao, Jingping
Click-Through Rate (CTR) prediction models are integral to a myriad of industrial settings, such as personalized search advertising. Current methods typically involve feature extraction from users' historical behavior sequences combined with product information, feeding into a discriminative model that is trained on user feedback to estimate CTR. With the success of models such as GPT, the potential for generative models to enrich expressive power beyond discriminative models has become apparent. In light of this, we introduce a novel model that leverages generative models to enhance the precision of CTR predictions in discriminative models. To reconcile the disparate data aggregation needs of both model types, we design a two-stage training process: 1) Generative pre-training for next-item prediction with the given item category in user behavior sequences; 2) Fine-tuning the well-trained generative model within a discriminative CTR prediction framework. Our method's efficacy is substantiated through extensive experiments on a new dataset, and its significant utility is further corroborated by online A/B testing results. Currently, the model is deployed on one of the world's largest e-commerce platforms, and we intend to release the associated code and dataset in the future.
Scaled Supervision is an Implicit Lipschitz Regularizer
Ouyang, Zhongyu, Zhang, Chunhui, Jia, Yaning, Vosoughi, Soroush
In modern social media, recommender systems (RecSys) rely on the click-through rate (CTR) as the standard metric to evaluate user engagement. CTR prediction is traditionally framed as a binary classification task to predict whether a user will interact with a given item. However, this approach overlooks the complexity of real-world social modeling, where the user, item, and their interactive features change dynamically in fast-paced online environments. This dynamic nature often leads to model instability, reflected in overfitting short-term fluctuations rather than higher-level interactive patterns. While overfitting calls for more scaled and refined supervisions, current solutions often rely on binary labels that overly simplify fine-grained user preferences through the thresholding process, which significantly reduces the richness of the supervision. Therefore, we aim to alleviate the overfitting problem by increasing the supervision bandwidth in CTR training. Specifically, (i) theoretically, we formulate the impact of fine-grained preferences on model stability as a Lipschitz constrain; (ii) empirically, we discover that scaling the supervision bandwidth can act as an implicit Lipschitz regularizer, stably optimizing existing CTR models to achieve better generalizability. Extensive experiments show that this scaled supervision significantly and consistently improves the optimization process and the performance of existing CTR models, even without the need for additional hyperparameter tuning.
Balancing Efficiency and Effectiveness: An LLM-Infused Approach for Optimized CTR Prediction
Zhang, Guoxiao, Wei, Yi, Zhang, Yadong, Feng, Huajian, Liu, Qiang
Click-Through Rate (CTR) prediction is essential in online advertising, where semantic information plays a pivotal role in shaping user decisions and enhancing CTR effectiveness. Capturing and modeling deep semantic information, such as a user's preference for "H\"aagen-Dazs' HEAVEN strawberry light ice cream" due to its health-conscious and premium attributes, is challenging. Traditional semantic modeling often overlooks these intricate details at the user and item levels. To bridge this gap, we introduce a novel approach that models deep semantic information end-to-end, leveraging the comprehensive world knowledge capabilities of Large Language Models (LLMs). Our proposed LLM-infused CTR prediction framework(Multi-level Deep Semantic Information Infused CTR model via Distillation, MSD) is designed to uncover deep semantic insights by utilizing LLMs to extract and distill critical information into a smaller, more efficient model, enabling seamless end-to-end training and inference. Importantly, our framework is carefully designed to balance efficiency and effectiveness, ensuring that the model not only achieves high performance but also operates with optimal resource utilization. Online A/B tests conducted on the Meituan sponsored-search system demonstrate that our method significantly outperforms baseline models in terms of Cost Per Mile (CPM) and CTR, validating its effectiveness, scalability, and balanced approach in real-world applications.
- Asia > China > Beijing > Beijing (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > China > Hunan Province > Changsha (0.04)
Explainable CTR Prediction via LLM Reasoning
Yu, Xiaohan, Zhang, Li, Chen, Chong
Recommendation Systems have become integral to modern user experiences, but lack transparency in their decision-making processes. Existing explainable recommendation methods are hindered by reliance on a post-hoc paradigm, wherein explanation generators are trained independently of the underlying recommender models. This paradigm necessitates substantial human effort in data construction and raises concerns about explanation reliability. In this paper, we present ExpCTR, a novel framework that integrates large language model based explanation generation directly into the CTR prediction process. Inspired by recent advances in reinforcement learning, we employ two carefully designed reward mechanisms, LC alignment, which ensures explanations reflect user intentions, and IC alignment, which maintains consistency with traditional ID-based CTR models. Our approach incorporates an efficient training paradigm with LoRA and a three-stage iterative process. ExpCTR circumvents the need for extensive explanation datasets while fostering synergy between CTR prediction and explanation generation. Experimental results demonstrate that ExpCTR significantly enhances both recommendation accuracy and interpretability across three real-world datasets.
- Europe > Germany > Lower Saxony > Hanover (0.05)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models
Zhang, Kexin, Lyu, Fuyuan, Tang, Xing, Liu, Dugang, Ma, Chen, Ding, Kaize, He, Xiuqiang, Liu, Xue
The evolution of previous Click-Through Rate (CTR) models has mainly been driven by proposing complex components, whether shallow or deep, that are adept at modeling feature interactions. However, there has been less focus on improving fusion design. Instead, two naive solutions, stacked and parallel fusion, are commonly used. Both solutions rely on pre-determined fusion connections and fixed fusion operations. It has been repetitively observed that changes in fusion design may result in different performances, highlighting the critical role that fusion plays in CTR models. While there have been attempts to refine these basic fusion strategies, these efforts have often been constrained to specific settings or dependent on specific components. Neural architecture search has also been introduced to partially deal with fusion design, but it comes with limitations. The complexity of the search space can lead to inefficient and ineffective results. To bridge this gap, we introduce OptFusion, a method that automates the learning of fusion, encompassing both the connection learning and the operation selection. We have proposed a one-shot learning algorithm tackling these tasks concurrently. Our experiments are conducted over three large-scale datasets. Extensive experiments prove both the effectiveness and efficiency of OptFusion in improving CTR model performance. Our code implementation is available here\url{https://github.com/kexin-kxzhang/OptFusion}.
- North America > Canada > Quebec > Montreal (0.14)
- Oceania > Australia > New South Wales > Sydney (0.14)
- Europe > Germany > Lower Saxony > Hanover (0.05)
- (28 more...)