B More Experimental Setups
–Neural Information Processing Systems
Example Reweighting directly assigns an importance weight to the standard CE training loss, according to the bias degree $\beta$: $\mathcal{L}_{\text{reweight}} = -(1 - \beta)\, y \cdot \log p_m$ (3)

Confidence Regularization is based on knowledge distillation [9]. It involves a teacher model trained with the standard CE loss.

Specifically, we calculate the weighted average of the F1 score of each class. The splits used for evaluation are highlighted in red.

To address this problem, we select the best checkpoint after $0.7\, t_{\max}$ of training, but still according to the performance on the ID dev set.
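The example-reweighting loss in Eq. (3) can be illustrated with a minimal sketch. It assumes that $y$ is a one-hot label, so that $y \cdot \log p_m$ reduces to the log-probability the main model assigns to the gold class, and that $\beta$ lies in $[0, 1]$ and is supplied externally. The function names `reweighted_ce` and `batch_reweighted_ce` are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def reweighted_ce(probs: Sequence[float], label: int, beta: float) -> float:
    """Per-example loss -(1 - beta) * log p_m[label].

    probs : main-model class probabilities p_m (assumed to sum to 1)
    label : index of the gold class (one-hot y, by assumption)
    beta  : bias degree of the example, assumed to lie in [0, 1]
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is assumed to lie in [0, 1]")
    # Highly biased examples (beta near 1) contribute almost nothing.
    return -(1.0 - beta) * math.log(probs[label])


def batch_reweighted_ce(
    batch_probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    betas: Sequence[float],
) -> float:
    """Mean of the per-example reweighted losses over a batch."""
    losses = [
        reweighted_ce(p, y, b) for p, y, b in zip(batch_probs, labels, betas)
    ]
    return sum(losses) / len(losses)
```

With `beta = 0` the loss reduces to the standard CE loss, and with `beta = 1` the example is ignored entirely.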
Feb-10-2026, 00:35:45 GMT