 reviewer4


ea89621bee7c88b2c5be6681c8ef4906-AuthorFeedback.pdf

Neural Information Processing Systems

In contrast, we use 10% of the training set for validation, and treat the validation set as a purely held-out test set (this also means that we train on less data). We will explain this more clearly. ... both spheres are sufficiently tiny (i.e.


464074179972cbbd75a39abc6954cd12-AuthorFeedback.pdf

Neural Information Processing Systems

We are grateful to the reviewers for the insightful comments on our submission. To this end, we intended to remove any over-fitting effect by using a dense training set for clear illustration in the first and fourth columns in Figure 2. (ii) 4.2 is to show how the NLL helps alleviate the over-fitting issue, where the validation set with 10,000 samples was generated uniformly in Ω.