Reviews: The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

Neural Information Processing Systems 

The paper is well written and easy to follow. Parallelization of BO is an important subject for practical hyperparameter optimization, and the proposed approach is interesting and more elegant than most existing approaches I am aware of. The fact that a Bayes-optimal batch is determined is very promising. The authors assume independent, normally distributed errors, which is common in most BO methods based on Gaussian processes. In hyperparameter optimization, however, this assumption is problematic, since measurement errors represent the difference between generalization performance and its empirical estimate (e.g., obtained through cross-validation).
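To make the assumption in question concrete: in standard GP-based BO, observation noise enters the model as a single variance term added to the kernel's diagonal, i.e., errors are treated as i.i.d. Gaussian across all evaluated configurations. The following minimal sketch (plain NumPy, with an illustrative RBF kernel; function names and hyperparameters are my own, not the paper's) shows where this homoscedastic noise term appears:

```python
import numpy as np

def gp_posterior(X_train, y_train, X_test, noise_var=0.1, length_scale=1.0):
    """GP posterior mean and variance with an RBF kernel, under the
    standard assumption of independent, homoscedastic Gaussian noise."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale**2)

    # The i.i.d.-noise assumption enters here: a single variance on the
    # diagonal, identical for every observation and uncorrelated across them.
    K = rbf(X_train, X_train) + noise_var * np.eye(len(X_train))
    Ks = rbf(X_train, X_test)
    Kss = rbf(X_test, X_test)

    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss) - (v * v).sum(axis=0)
    return mean, var
```

Cross-validation estimates at nearby hyperparameter settings are computed on the same folds, so their errors are correlated; capturing that would require replacing the diagonal term `noise_var * np.eye(...)` with a full noise covariance, which this model structure does not accommodate.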