Review for NeurIPS paper: The Diversified Ensemble Neural Network

Neural Information Processing Systems 

In other words, how can we be sure that the improvements obtained with DEns-NN contribute to generalization? One suggestion is to track L_d (the diversity loss function) during training on both the training data and some validation data; it would be insightful to see how much of the error reduction is correlated with, or due to, L_d.

3- Along the same lines, how are the hyperparameters of R-Forest, XGBoost, and NN set? Do they differ across datasets? Are they tuned individually for each dataset using a validation set?

4- The paper is clearly written; however, it could be improved by revisiting the notation; for example, N is used for different purposes. There are also some typos.
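
To make the suggestion about tracking L_d more concrete, here is a minimal sketch of the kind of diagnostic I have in mind. It is only an illustration under stated assumptions: `pairwise_diversity` is a hypothetical stand-in (mean squared disagreement between ensemble members) and is not the paper's definition of L_d, and the function names are my own. The idea is to log a diversity value and an error value per epoch, on training and on validation data separately, and then check how strongly the two curves are correlated.

```python
import numpy as np


def pairwise_diversity(member_preds):
    """Mean squared difference over all pairs of ensemble members.

    member_preds: array-like of shape (n_members, n_samples).
    ASSUMPTION: a simple proxy for the paper's L_d, not its actual definition.
    """
    preds = np.asarray(member_preds, dtype=float)
    n_members = preds.shape[0]
    if n_members < 2:
        raise ValueError("need at least two ensemble members")
    total, pairs = 0.0, 0
    for i in range(n_members):
        for j in range(i + 1, n_members):
            total += np.mean((preds[i] - preds[j]) ** 2)
            pairs += 1
    return total / pairs


def diversity_error_correlation(diversity_per_epoch, error_per_epoch):
    """Pearson correlation between logged diversity and error curves."""
    return float(np.corrcoef(diversity_per_epoch, error_per_epoch)[0, 1])
```

Computing this correlation once on the training curves and once on the validation curves would show whether higher diversity goes along with lower held-out error, or only with lower training error. A correlation alone does not establish that L_d causes the improvement, so an ablation without the diversity term would still be needed.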