self-distillation (SD) and label-smoothing (LS) as MAP insightful ([R2], [R3], [R4]), that relating accuracy to confidence
–Neural Information Processing Systems
We thank all reviewers for their constructive feedback! We address the reviewers' comments below and will incorporate all feedback. This explains why SD outperforms LS. Please refer to our response to [R3] for discussion on CD. One can alternatively compute the variance of prediction confidence.
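As a rough illustration of the last remark, the variance of prediction confidence could be computed along the following lines. This is only a sketch under stated assumptions: "confidence" is taken here to mean the maximum softmax probability per sample, which the response does not spell out, and the function name `confidence_variance` is hypothetical.

```python
import numpy as np

def confidence_variance(logits):
    """Variance of per-sample prediction confidence.

    Assumption: confidence = maximum softmax probability of each sample.
    `logits` has shape (num_samples, num_classes).
    """
    logits = np.asarray(logits, dtype=float)
    # Subtract the row-wise maximum for a numerically stable softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # Confidence of each prediction is its top-class probability.
    confidence = probs.max(axis=1)
    # Population variance of confidence across samples.
    return confidence.var()
```

A value near zero means the model assigns almost the same confidence to every sample, while a larger value means confidence differs more from sample to sample.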
Oct-2-2025, 06:11:38 GMT