




Appendix A Further Empirical Studies

Neural Information Processing Systems

As reported in Table A3, PS-MT consistently shows lower distances than Dual Teacher. The standard deviation (STD) is likewise between 2 and over 50 times smaller. PS-MT's teachers, although they may have distinct characteristics, potentially end up at similar distances from the student at each epoch. Comparative analysis of performance based on different CutMix variations. We further report additional quantitative results encompassing three different splits: the original high-quality set, the blended set, and the blended high-quality set.





On the Effectiveness of Lipschitz-Driven Rehearsal in Continual Learning - Supplementary Material

Neural Information Processing Systems

If α > β, we are overemphasizing the contribution of the first term of Eq. 9 (which brings each layer's λ^k_1 and c_k close to each other) over the second one (which induces small Lipschitz targets).
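The trade-off between the two weights can be illustrated with a small numerical sketch. Eq. 9 is not reproduced in this excerpt, so the form used below is an assumption made purely for illustration: a first term that penalizes the gap between each layer's estimated constant and its target, and a second term that penalizes the size of the targets. The function name `weighted_lipschitz_penalty` and the absolute-difference and mean choices are hypothetical and are not taken from the paper.

```python
def weighted_lipschitz_penalty(lambdas, targets, alpha, beta):
    """Hypothetical two-term penalty (NOT the paper's Eq. 9).

    lambdas: per-layer estimated Lipschitz constants (assumed given)
    targets: per-layer Lipschitz targets c_k (assumed given)
    Returns (total, weighted_first_term, weighted_second_term).
    """
    n = len(lambdas)
    # First term: how far each layer's constant is from its target.
    closeness = sum(abs(l - c) for l, c in zip(lambdas, targets)) / n
    # Second term: how large the targets themselves are.
    target_size = sum(targets) / n
    first = alpha * closeness
    second = beta * target_size
    return first + second, first, second


# With alpha > beta, the first term dominates the total penalty.
total, first, second = weighted_lipschitz_penalty(
    lambdas=[2.0, 3.0], targets=[1.0, 1.0], alpha=2.0, beta=0.5
)
print(total, first, second)
```

Under these assumed values, the first term contributes most of the penalty, so an optimizer would mainly be pushed to match constants to targets and only weakly pushed to shrink the targets.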