Reviews: A Communication-Efficient Parallel Algorithm for Decision Tree

Neural Information Processing Systems 

Given the popularity of decision trees, proposing an efficient parallel implementation of this method is of course very relevant. The proposed parallelization is original with respect to existing methods and it should indeed lead to less communications than other methods. The theoretical analysis is sound and I like the discussion of the impact of the main problem and method parameters that follows from the lower bound provided in theorem 4.1. Experiments are conducted on two very large problems, where, in the limit of the tested settings (see below), PV-tree is clearly shown to outperform other parallel implementations, in terms of both computing times to reach a given accuracy level and communication costs. I nevertheless have two major concerns with the proposed parallelization.