Review for NeurIPS paper: Interpretable and Personalized Apprenticeship Scheduling: Learning Interpretable Scheduling Policies from Heterogeneous User Demonstrations

Neural Information Processing Systems 

Weaknesses: A major point of concern is the value of the user study in comparing the interpretability of the scheduling decision tree to that of a neural network. While there is nothing wrong with the study methodology itself, or with the subsequent statistical analysis, it is not clear that the user study really captures the difference in interpretability between the neural network and the decision tree. It seems that the study may not actually test the hypotheses (H1, H2, H3) that the authors claim it does. The study asks participants to manually compute the outputs of decision trees and neural networks, given their decision thresholds (for trees) or weights and biases (for networks). It is not surprising that a decision tree is easier to execute by hand, as far fewer operations are needed, but this comparison seems spurious. In practice, we would never expect a user to manually compute the output of a neural network (or of a large decision tree, for that matter).
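To make the operation-count point concrete, here is a rough sketch of how the manual effort might be tallied. The function names, the tree depth, and the layer sizes are hypothetical and chosen only for illustration; they are not taken from the paper, and the count assumes a fully connected network with one bias and one activation per unit.

```python
from typing import Sequence


def tree_ops(depth: int) -> int:
    """Threshold comparisons along one root-to-leaf path of a binary tree."""
    return depth


def mlp_ops(layer_sizes: Sequence[int]) -> int:
    """Arithmetic steps for one forward pass of a fully connected network.

    Assumes each unit needs n_in multiplications, n_in additions
    (n_in - 1 to sum the products, 1 for the bias), and 1 activation.
    """
    ops = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        ops += n_in * n_out  # multiplications
        ops += n_in * n_out  # additions, including the bias
        ops += n_out         # activation evaluations
    return ops


# Hypothetical sizes: a depth-3 tree versus a 4-3-2 network.
print(tree_ops(3))          # 3
print(mlp_ops([4, 3, 2]))   # 41
```

Even at these toy sizes the network requires roughly an order of magnitude more hand computation, which suggests the study may be measuring arithmetic burden rather than interpretability.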