Review for NeurIPS paper: Decision trees as partitioning machines to characterize their generalization properties

Neural Information Processing Systems 

Weaknesses: * On the theoretical side, the main weakness may be that the analysis is restricted to purely continuous features. Such a setting is rare in today's most challenging machine learning problems, so the theoretical implications may be limited. On the empirical side, I am curious why the proposed method was compared only against CART, a very old algorithm. Is it the only decision tree algorithm that can reasonably be compared with the proposed method?