Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction
Gai, Kun, Zhu, Xiaoqiang, Li, Han, Liu, Kai, Wang, Zhe
Click-through rate (CTR) prediction in real-world business is a difficult machine learning problem involving large-scale, nonlinear, sparse data. In this paper, we introduce an industrial-strength solution built on a model named the Large Scale Piece-wise Linear Model (LS-PLM). We formulate the learning problem with $L_1$ and $L_{2,1}$ regularizers, which leads to a non-convex and non-smooth optimization problem. We then propose a novel algorithm that solves it efficiently, based on directional derivatives and a quasi-Newton method. In addition, we design a distributed system that runs on hundreds of machines in parallel and provides industrial scalability. The LS-PLM model can capture nonlinear patterns from massive sparse data, saving us from heavy feature engineering work. Since 2012, LS-PLM has been the main CTR prediction model in Alibaba's online display advertising system, serving hundreds of millions of users every day.
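The abstract names the regularizers but does not spell out the model form. The following is a minimal sketch, assuming a softmax-gated mixture of logistic regressions as the piece-wise linear model, with `m` regions, sparse inputs stored as `{feature_index: value}` dictionaries, and regularization weights `lam` and `beta`. All function and parameter names here are hypothetical, chosen for illustration only. It shows how the $L_1$ term sums absolute values of all parameters, while the $L_{2,1}$ term sums, per feature, the $L_2$ norm of that feature's parameters across every region.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot_sparse(weights, x):
    # x is a sparse sample: {feature_index: value}.
    return sum(weights[j] * v for j, v in x.items())

def predict(U, W, x):
    # Assumed model form: softmax gates (U) pick a soft region,
    # and each region has its own logistic predictor (W).
    gates = softmax([dot_sparse(u, x) for u in U])
    return sum(g * sigmoid(dot_sparse(w, x)) for g, w in zip(gates, W))

def l1_norm(U, W):
    # Sum of absolute values over all parameters.
    return sum(abs(v) for vec in U + W for v in vec)

def l21_norm(U, W):
    # For each feature j, take the L2 norm of its 2m parameters,
    # then sum over features; this encourages whole features to vanish.
    d = len(U[0])
    return sum(math.sqrt(sum(vec[j] ** 2 for vec in U + W)) for j in range(d))

def objective(U, W, data, lam=1.0, beta=1.0):
    # Negative log-likelihood plus both regularizers (weights are assumptions).
    nll = 0.0
    for x, y in data:
        p = min(max(predict(U, W, x), 1e-12), 1.0 - 1e-12)
        nll -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return nll + lam * l21_norm(U, W) + beta * l1_norm(U, W)
```

Because both penalty terms involve absolute values or norms that are not differentiable at zero, and the mixture likelihood is not convex, plain gradient methods do not apply directly, which is consistent with the abstract's motivation for a directional-derivative approach.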
Apr-18-2017
- Country:
  - Asia (0.14)
- Genre:
  - Research Report (0.51)
- Industry:
  - Information Technology > Services (0.67)
  - Marketing (0.49)
  - Education > Focused Education > Special Education (0.45)
- Technology: