Review for NeurIPS paper: Bayesian Optimization for Iterative Learning

Neural Information Processing Systems 

The paper proposes a method for tuning hyper-parameters in deep (reinforcement) learning using Bayesian optimization. The key idea is to exploit the iterative structure of the problem and use a variable-augmentation trick to learn a score function that compresses the learning progress at any stage.

The strengths of the paper are:
- well written
- good relation to prior work
- good experimental study

However, the paper also has weaknesses, which are mostly related to theoretical aspects and the chosen heuristics (see some details below). If we are only interested in the predictive mean of the cost GP, why use a GP in the first place rather than a parametric function, which scales much better? This is the part that caused us the most concern.
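
To make the scaling question concrete, here is a minimal sketch of our own (not the authors' code) contrasting the two options. It assumes an RBF kernel, a fixed noise variance, and a linear model as the parametric alternative; the function names `gp_predictive_mean` and `linear_predictive_mean` are hypothetical. The exact GP mean requires factorizing an n x n kernel matrix, which costs O(n^3) in the number of observations n, whereas the least-squares fit costs O(n d^2) for d input dimensions.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_predictive_mean(x_train, y_train, x_test, noise=1e-2, lengthscale=1.0):
    """Zero-mean GP posterior mean: k_*^T (K + noise * I)^{-1} y."""
    n = len(x_train)
    k = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(n)
    chol = np.linalg.cholesky(k)  # O(n^3) in the number of observations
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return rbf_kernel(x_test, x_train, lengthscale) @ alpha


def linear_predictive_mean(x_train, y_train, x_test):
    """Parametric alternative (assumed here): least-squares fit with a bias."""
    design = np.hstack([x_train, np.ones((len(x_train), 1))])
    weights, *_ = np.linalg.lstsq(design, y_train, rcond=None)  # O(n d^2)
    return np.hstack([x_test, np.ones((len(x_test), 1))]) @ weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical inputs: (hyper-parameter value, fraction of training iterations).
    x_train = rng.uniform(0.0, 1.0, size=(50, 2))
    # Synthetic cost that grows linearly with the number of iterations.
    y_train = 2.0 + 0.5 * x_train[:, 0] + 3.0 * x_train[:, 1]
    x_test = rng.uniform(0.0, 1.0, size=(5, 2))
    print("GP mean:    ", gp_predictive_mean(x_train, y_train, x_test))
    print("Linear mean:", linear_predictive_mean(x_train, y_train, x_test))
```

If only the point prediction of the cost is consumed downstream, the two functions are interchangeable from the caller's perspective, which is why the choice of a GP deserves justification.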