 natural distribution shift






Effective Robustness against Natural Distribution Shifts for Models with Different Training Data

Neural Information Processing Systems

Existing evaluations of effective robustness typically use a single test set, such as ImageNet, to measure in-distribution (ID) accuracy. This becomes problematic when comparing models trained on different data distributions, e.g., models trained on ImageNet versus zero-shot language-image pre-trained models trained on LAION. In this paper, we propose a new metric for evaluating and comparing the effective robustness of models trained on different data. To do this, we control for accuracy on multiple ID test sets that together cover the training distributions of all evaluated models. Our metric provides a better estimate of effective robustness when the evaluated models have different training data. It may also explain the surprising effective robustness gains of zero-shot CLIP-like models reported in prior work that used ImageNet as the only ID test set: these gains diminish under our new evaluation.
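
The idea of controlling for accuracy on several ID test sets can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: accuracies are logit-transformed, a linear baseline is fitted by least squares across models, and effective robustness is taken as the gap between a model's actual OOD accuracy and the baseline's prediction. The function names (`logit`, `sigmoid`, `effective_robustness`) are hypothetical, and the demo data is synthetic.

```python
import numpy as np


def logit(p):
    """Map accuracies in (0, 1) to logit space, clipping to avoid infinities."""
    p = np.clip(np.asarray(p, dtype=float), 1e-6, 1 - 1e-6)
    return np.log(p / (1 - p))


def sigmoid(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def effective_robustness(id_accs, ood_acc):
    """Residual OOD accuracy after controlling for one or more ID accuracies.

    id_accs: shape (n_models,) or (n_models, n_id_test_sets)
    ood_acc: shape (n_models,)
    Assumed baseline: logit(ood) ~ w . logit(id) + b, fitted over all models.
    """
    X = logit(id_accs)
    if X.ndim == 1:
        X = X[:, None]  # single ID test set: the one-dimensional baseline
    X = np.column_stack([X, np.ones(X.shape[0])])  # append intercept column
    y = logit(ood_acc)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    predicted = sigmoid(X @ coef)
    return np.asarray(ood_acc, dtype=float) - predicted


if __name__ == "__main__":
    # Synthetic accuracies for illustration only (not results from the paper).
    rng = np.random.default_rng(0)
    id_a = rng.uniform(0.3, 0.9, size=20)  # e.g. accuracy on an ImageNet-like ID set
    id_b = rng.uniform(0.3, 0.9, size=20)  # e.g. accuracy on a LAION-like ID set
    ood = sigmoid(0.5 * logit(id_a) + 0.4 * logit(id_b) - 0.3)
    print(effective_robustness(id_a, ood))
    print(effective_robustness(np.column_stack([id_a, id_b]), ood))
```

In this synthetic setup the OOD accuracy depends on both ID accuracies, so a baseline fitted on a single ID test set leaves apparent robustness gains or losses that shrink once the second ID test set is included. This mirrors the effect described in the abstract but is not a reproduction of the paper's experiments.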






Review for NeurIPS paper: Measuring Robustness to Natural Distribution Shifts in Image Classification

Neural Information Processing Systems

I have to look more deeply into this, but judging from a quick read, their results do indeed change my perception of the performance gap on ImageNet-V2. Nevertheless, I think ObjectNet is the more obvious example and should be front and center.