Reviews: Optimal Sparse Decision Trees

Neural Information Processing Systems 

Originality: Training of optimal decision trees is clearly a problem that has seen a lot of prior work. A distinguishing feature of this submission is that it focuses on optimal *sparse* decision trees for binary variables, and that the approach seems to be feasible in practice, which is achieved by a combination of analytical bounds that reduce the search space as well as efficient implementation techniques. The work builds upon the CORLES algorithm and its approach to creating optimal decision lists. However, the authors extend this approach to decision trees in a non-trivial manner that adds substantial novelty. Quality: The claims of the paper are very well supported by theoretical analysis as well as experiments.