5bacb12bf81e98e2ee0eed953a23c656-Paper-Conference.pdf

Neural Information Processing Systems

Instead, our bound requires a simple, intuitive condition which is well justified by prior empirical works and holds in practice effectively 100% of the time. The bound is inspired by HΔH-divergence but is easier to evaluate and substantially tighter, consistently providing nonvacuous test error upper bounds.


747d3443e319a22747fbb873e8b2f9f2-Supplemental.pdf

Neural Information Processing Systems

It can be derived that the posterior process f | O is also a GP; we denote its mean function and kernel function as µ_n and κ_n respectively. To reduce time consumption and take advantage of parallelization, we train several different networks at a time. When selecting the first BSSC, equation 2 can be used directly. Therefore, we use the expected value of the EI function (EEI, [4]) instead. ResNet18/50 consists of 6 stages as illustrated in Figure 1.
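
The statement that the posterior process is again a GP with mean µ_n and kernel κ_n can be illustrated with the standard closed-form conditioning formulas. The following is a minimal sketch, not the authors' implementation: the zero prior mean, the squared-exponential kernel, the noise variance, and the names `rbf_kernel` and `gp_posterior` are all assumptions made for illustration, since the excerpt does not say which prior or kernel is used.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel (an assumed choice) between point sets a (m, d) and b (k, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior(x_obs, y_obs, x_query, noise_var=1e-6, length_scale=1.0):
    """Posterior mean mu_n and covariance kappa_n of a zero-mean GP at x_query.

    x_obs, y_obs play the role of the observation set O in the text.
    """
    k_oo = rbf_kernel(x_obs, x_obs, length_scale) + noise_var * np.eye(len(x_obs))
    k_oq = rbf_kernel(x_obs, x_query, length_scale)
    k_qq = rbf_kernel(x_query, x_query, length_scale)

    # Cholesky factorization is used instead of an explicit matrix inverse.
    chol = np.linalg.cholesky(k_oo)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    v = np.linalg.solve(chol, k_oq)

    mu_n = k_oq.T @ alpha        # posterior mean at the query points
    kappa_n = k_qq - v.T @ v     # posterior covariance at the query points
    return mu_n, kappa_n


# Toy observations (hypothetical values, for illustration only).
x_obs = np.array([[0.0], [1.0], [2.0]])
y_obs = np.array([0.0, 1.0, 0.5])
mu_n, kappa_n = gp_posterior(x_obs, y_obs, x_obs)
```

Querying at the observed inputs with a small noise variance should return a posterior mean close to the observed values and a posterior variance close to zero, which is a quick sanity check on the conditioning step.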