Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

Apr 18, 2017

nonlinear ctr prediction alibaba logistic plm piece click stat-ml cs-lg

  • Kun Gai
  • Xiaoqiang Zhu
  • Han Li
  • Kai Liu
  • Zhe Wang

Click-through rate (CTR) prediction in real-world business is a difficult machine learning problem involving large-scale, nonlinear, sparse data. In this paper, we introduce an industrial-strength solution built on a model named Large Scale Piece-wise Linear Model (LS-PLM). We formulate the learning problem with $L_1$ and $L_{2,1}$ regularizers, leading to a non-convex and non-smooth optimization problem. We then propose a novel algorithm to solve it efficiently, based on directional derivatives and a quasi-Newton method. In addition, we design a distributed system that can run on hundreds of machines in parallel and provides industrial scalability. The LS-PLM model can capture nonlinear patterns from massive sparse data, saving us from heavy feature engineering. Since 2012, LS-PLM has been the main CTR prediction model in Alibaba’s online display advertising system, serving hundreds of millions of users every day.
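To make the formulation concrete, here is a minimal NumPy sketch of a piece-wise linear CTR model with $L_1$ and $L_{2,1}$ penalties. The abstract does not spell out the model form, so the sketch assumes a softmax-gated mixture of logistic regressions: a softmax over `U` softly assigns each sample to one of `m` regions, and a sigmoid over `W` predicts within each region. The function names, the parameter names `U`, `W`, `lam`, `beta`, and the choice to take the $L_{2,1}$ norm over per-feature rows are illustrative assumptions, not the authors' implementation. The optimizer and the distributed system are not shown.

```python
import numpy as np

def ls_plm_predict(X, U, W):
    """Predict click probabilities with an assumed softmax-gated mixture.

    X: (n, d) feature matrix; U, W: (d, m) gating and prediction weights.
    Returns an (n,) array of probabilities.
    """
    gate_logits = X @ U
    # Subtract the row-wise max for a numerically stable softmax.
    gate_logits = gate_logits - gate_logits.max(axis=1, keepdims=True)
    gates = np.exp(gate_logits)
    gates = gates / gates.sum(axis=1, keepdims=True)
    # One logistic regression per region.
    experts = 1.0 / (1.0 + np.exp(-(X @ W)))
    # Mix the per-region predictions by the gate weights.
    return (gates * experts).sum(axis=1)

def regularizer(U, W, lam, beta):
    """lam * L_{2,1} + beta * L_1 over the stacked (d, 2m) parameters."""
    theta = np.hstack([U, W])
    # L_{2,1}: sum over features (rows) of the Euclidean norm of each row,
    # which pushes all parameters of a feature to zero together.
    l21 = np.sqrt((theta ** 2).sum(axis=1)).sum()
    # L_1: sum of absolute values, which encourages element-wise sparsity.
    l1 = np.abs(theta).sum()
    return lam * l21 + beta * l1

def objective(X, y, U, W, lam, beta):
    """Negative log-likelihood plus the two penalties (non-convex, non-smooth)."""
    p = np.clip(ls_plm_predict(X, U, W), 1e-12, 1.0 - 1e-12)
    nll = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).sum()
    return nll + regularizer(U, W, lam, beta)
```

With all weights at zero, every gate is uniform and every region predicts 0.5, so the model outputs 0.5 for any input. The penalties are non-differentiable wherever a parameter or a whole feature row is exactly zero, which is why the abstract turns to directional derivatives rather than plain gradients.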

