Analyzing Lottery Ticket Hypothesis from PAC-Bayesian Theory Perspective

Neural Information Processing Systems

However, since the initial large learning rate generally helps the optimizer to converge to flatter minima, we hypothesize that the winning tickets have relatively sharp minima, which is considered a disadvantage in terms of generalization ability.