Reviews: Multi-Layered Gradient Boosting Decision Trees

Neural Information Processing Systems 

Short overview: The authors propose to build a neural-network-like structure that uses gradient boosted decision trees (GBDTs) as the components of its layers. Since GBDTs cannot propagate gradients, they propose a training method inspired by target propagation: each layer's GBDT is built to approximate the gradient of the loss between its prediction function and a pseudo target, taken with respect to the prediction function. Pseudo targets are updated at each iteration using the reverse mapping of the built tree representation and the pseudo label of the next layer. The reverse mapping is learned by minimizing a reconstruction loss. At each iteration, each layer's ensemble grows by one boosting tree. The authors hint at a potential application in blocking adversarial attacks that rely on estimating the gradient of the final loss with respect to the input, since such attacks would not work against layers that cannot propagate gradients. However, this direction is not explored in the paper.

Detailed comments: Overall, an interesting idea of co-training GBDTs with NNs.
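
For reference, one iteration of the training scheme summarized in the short overview can be written out roughly as below. This is only a sketch under assumed notation, not the authors' exact formulation: $F_i$ and $G_i$ denote the forward and reverse mappings of layer $i$ (out of $M$ layers), $o_i$ the layer outputs, $z_i$ the pseudo targets, $\ell$ and $\ell^{\mathrm{inv}}$ the per-layer and reconstruction losses, and $h_i$ the regression tree added in the current iteration. Seeding the top-layer pseudo target with a gradient step of size $\alpha$ on the final loss $L$ is an assumption of this sketch and is not stated in the overview.

```latex
\begin{align*}
% Forward pass through the current ensembles
o_0 &= x, \qquad o_i = F_i(o_{i-1}), \quad i = 1, \dots, M \\
% Reverse mappings are fit with a reconstruction loss
G_i &\leftarrow \text{one boosting step on } \ell^{\mathrm{inv}}\bigl(G_i(F_i(o_{i-1})),\, o_{i-1}\bigr), \quad i = 2, \dots, M \\
% Top-layer pseudo target (assumed: a gradient step on the final loss)
z_M &= o_M - \alpha \, \frac{\partial L(o_M, y)}{\partial o_M} \\
% Pseudo targets are propagated down through the reverse mappings
z_{i-1} &= G_i(z_i), \quad i = M, \dots, 2 \\
% Each layer fits one new tree to the negative gradient of its own loss
r_i &= -\frac{\partial \ell\bigl(F_i(o_{i-1}),\, z_i\bigr)}{\partial F_i(o_{i-1})}, \qquad
h_i \approx r_i \ \text{(regression tree on inputs } o_{i-1}\text{)} \\
F_i &\leftarrow F_i + h_i, \quad i = 1, \dots, M
\end{align*}
```

With a squared-error layer loss $\ell(u, z) = \tfrac{1}{2}\lVert u - z \rVert^2$, the residual reduces to $r_i = z_i - F_i(o_{i-1})$, so each layer simply regresses towards its pseudo target, and no gradient ever has to pass through a tree.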