Given the time- and space-bounded aspects of the rebuttal, hoping we clarified the main questions of the reviewers

Neural Information Processing Systems

We thank the four reviewers for their insightful comments and suggestions.

"I looked into the paper in ref [12] ...": In [12], the greedy algorithm is generic, with no assumptions about models [...]

"[...]": Random search leads to a set of [...]

For Tab. 1, we ran the Wilcoxon signed-rank test (paired along settings, datasets and model types) and [...] For Tab. 2 (with more costly experiments), we do not have enough runs to apply such [...] We nonetheless report the standard errors in the paper, which seem to indicate significant improvements.

"[...]": Those numbers indicate the size of the ensemble; we will clarify this point.

"[...]": We thank R1 for the idea and ran our entire benchmark for ResNet-20: [...]

"[...]": Hyper ensembles can indeed be viewed as a mixture [...] They typically use Bayes nonparametric priors/posteriors and MCMC; we use mixtures and SGD.

"[...]": When used with replacement, the greedy algorithm from Caruana et al. [12, Sec.
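
To make the idea of greedy selection with replacement concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy predictions, and the choice of validation negative log-likelihood as the selection criterion are all assumptions made for illustration. At each step the model whose addition most lowers the loss of the averaged prediction is appended, and a model that was already chosen may be chosen again.

```python
import math


def nll(probs, labels):
    """Mean negative log-likelihood of the true labels (assumed criterion)."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)


def add_preds(a, b):
    """Element-wise sum of two lists of per-example class probabilities."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def greedy_with_replacement(model_preds, labels, ensemble_size):
    """Hypothetical helper: greedily pick `ensemble_size` models, repeats allowed.

    model_preds[m][i][c] is the probability model m assigns to class c on
    validation example i. Returns the selected model indices in order.
    """
    n_classes = len(model_preds[0][0])
    running_sum = [[0.0] * n_classes for _ in labels]
    selected = []
    for step in range(ensemble_size):
        best_idx, best_loss = None, float("inf")
        for idx, preds in enumerate(model_preds):
            candidate_sum = add_preds(running_sum, preds)
            # Uniform average over the step + 1 members of the candidate ensemble.
            averaged = [[v / (step + 1) for v in row] for row in candidate_sum]
            loss = nll(averaged, labels)
            if loss < best_loss:
                best_idx, best_loss = idx, loss
        selected.append(best_idx)
        running_sum = add_preds(running_sum, model_preds[best_idx])
    return selected


# Toy validation set: two examples, two classes, three hypothetical models.
labels = [0, 1]
model_preds = [
    [[0.9, 0.1], [0.6, 0.4]],    # model 0: confident on example 0, wrong on example 1
    [[0.3, 0.7], [0.05, 0.95]],  # model 1: wrong on example 0, confident on example 1
    [[0.5, 0.5], [0.5, 0.5]],    # model 2: uninformative
]
selected = greedy_with_replacement(model_preds, labels, ensemble_size=3)
print(selected)
```

In this toy setting model 0 is selected twice, so it receives twice the weight of model 1 in the averaged prediction; allowing repeats effectively yields a weighted ensemble.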