



Appendix for "Self-Interpretable Model with Transformation Equivariant Interpretation"

Neural Information Processing Systems

Please refer to Appendix 5 for details. In addition, to balance the classification loss and the transformation loss, we set the scalar factor to λ = 5 throughout the training phase. Here the first rows are the untransformed and the transformed images, while the second rows are the corresponding interpretations. This is a supplement to Figure 1 in the main body of the paper. There are also perturbation methods such as randomized input sampling (RISE) [8] and extremal perturbation (EP) [2].
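As a rough illustration of what balancing two losses with a scalar factor can look like, the sketch below assumes the factor multiplies the transformation loss before it is added to the classification loss. The excerpt does not state the exact form of the objective, so this weighting, the function name `combined_loss`, and the sample values are all hypothetical.

```python
LAMBDA = 5.0  # scalar factor; the value 5 is the one quoted in the excerpt


def combined_loss(classification_loss: float,
                  transformation_loss: float,
                  lam: float = LAMBDA) -> float:
    """Assumed objective: classification loss plus lam times transformation loss."""
    return classification_loss + lam * transformation_loss


# Hypothetical per-batch loss values, for illustration only.
total = combined_loss(classification_loss=0.5, transformation_loss=0.25)
print(total)
```

With a fixed factor, the transformation term contributes five times its raw value to the total throughout training, under the assumed form.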



Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization

Neural Information Processing Systems

Sharpness-Aware Minimization (SAM) has attracted significant attention for its effectiveness in improving generalization across various tasks. However, its underlying principles remain poorly understood.





0e915db6326b6fb6a3c56546980a8c93-Supplemental.pdf

Neural Information Processing Systems

Let $B$ be the maximum difference between $U_t^1$ and $U_t^2$, and let $(\pi, \theta_1, \theta_2)$ be a Nash equilibrium for $G$. Let $\pi_1$ be the best response to the first teacher (with utility $U_t^1$) and let $\pi_{1+2}$ be the best response policy to the joint teacher. This result shows that as we reduce the number of random episodes, the approximation to a minimax regret strategy improves. Let $G$ be the dual curriculum game in which the first teacher maximizes regret, so $U_t^1 = U_t^R$, and the second teacher plays randomly, so $U_t^2 = U_t^U$. Finally, we need to show that $\pi_{2+3}$ is optimal for the student.