Adaptive Regularization for Large-Scale Sparse Feature Embedding Models
The one-epoch overfitting problem has drawn widespread attention, especially in CTR and CVR estimation models in the search, advertising, and recommendation domains. These models, which rely heavily on large-scale sparse categorical features, often suffer a significant decline in performance when trained for multiple epochs. Although recent studies have proposed heuristic solutions, the fundamental cause of this phenomenon remains unclear. In this work, we present a theoretical explanation, grounded in Rademacher complexity and supported by empirical experiments, of why overfitting occurs in models with large-scale sparse categorical features. Based on this analysis, we propose a regularization method that adaptively constrains the norm budget of embedding layers. Our approach not only prevents the severe performance degradation observed during multi-epoch training, but also improves model performance within a single epoch. This method has already been deployed in online production systems.

Click-through rate (CTR) and conversion rate (CVR) estimation are critical for advertising, search, and recommendation (ASR) applications. E-commerce platforms like Amazon and Taobao rely on optimizing CTR and CVR estimation to boost gross merchandise volume (GMV), while advertising platforms at Google and Meta depend on it to drive revenue growth.
Nov-11-2025
- Country:
- Asia > China
- Shandong Province > Dongying (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > Canada
- British Columbia (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Industry:
- Information Technology > Services (0.88)
- Technology: