its semantic meaning: for all points with ALICE score of p, we expect p
Neural Information Processing Systems
The authors would like to thank the reviewers for their thoughtful comments. We have replaced the calibration experiment.

Figure 1: ALICE score calibration of ResNet32 trained on CIFAR10. At 50 epochs we reach maximum validation accuracy.

Full experimental details are in the final version.
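As a rough illustration of what a calibration check of this kind measures, the sketch below bins points by score and compares each bin's mean score with its empirical accuracy. It is a generic reliability computation, not the procedure behind Figure 1: the function names `calibration_table` and `expected_calibration_error`, the equal-width binning, and the assumption that scores lie in [0, 1] and are compared against per-point correctness are all ours.

```python
import numpy as np


def calibration_table(scores, correct, n_bins=10):
    """Bin points by score and report mean score vs. accuracy per bin.

    Assumes (hypothetically) that `scores` are in [0, 1] and `correct`
    holds 1 for a correct prediction and 0 otherwise.
    Returns tuples (bin_low, bin_high, mean_score, accuracy, count),
    skipping empty bins.
    """
    scores = np.asarray(scores, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Use interior edges only, so a score of exactly 1.0 lands in the last bin.
    idx = np.digitize(scores, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue
        rows.append((
            float(edges[b]),
            float(edges[b + 1]),
            float(scores[mask].mean()),
            float(correct[mask].mean()),
            int(mask.sum()),
        ))
    return rows


def expected_calibration_error(scores, correct, n_bins=10):
    """Count-weighted average of |mean score - accuracy| over non-empty bins."""
    rows = calibration_table(scores, correct, n_bins)
    total = sum(r[4] for r in rows)
    return sum(r[4] / total * abs(r[2] - r[3]) for r in rows)


if __name__ == "__main__":
    # Toy data for illustration only; not results from the paper.
    toy_scores = [0.05, 0.05, 0.95, 0.95]
    toy_correct = [0, 0, 1, 1]
    for row in calibration_table(toy_scores, toy_correct):
        print(row)
    print(expected_calibration_error(toy_scores, toy_correct))
```

A well-calibrated score would show mean score and accuracy close to each other in every bin; plotting the two against each other gives the usual reliability diagram.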
Oct-3-2025, 08:57:32 GMT