01d8bae291b1e4724443375634ccfa0e-AuthorFeedback.pdf

Neural Information Processing Systems 

To the best of our knowledge, this work is the first to introduce and formally define AV.

Here we give two examples (more details can be found in Sec 6.1 and 6.2): (1) Recent work [25,5,51] found that stochastic weight averaging (SWA) over iterations leads to HIGHER TRAINING LOSS but lower test error.

What about 2-D loss surfaces? Apart from the SOTA networks stated in our paper, we also conduct experiments on a simple MLP in the Appendix.
