Supplementary Material

A Distances and divergences for quantifying domain shift

A.1 The Wasserstein distance

Neural Information Processing Systems 

Besides analyzing the performance drop that occurs when a model using source statistics is evaluated on a target dataset, we consider the mismatch in model statistics directly. We first take an ImageNet-trained model and adapt it to each of the 95 conditions in IN-C.
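As a rough illustration of how such a mismatch in statistics could be quantified, the sketch below computes the 2-Wasserstein distance between two Gaussians with diagonal covariance. It assumes that the "model statistics" are per-channel means and variances and that each set can be treated as a diagonal Gaussian; this is an assumption made for illustration, not a statement of the exact procedure used here. The function name `w2_diag_gaussians` and the example values are hypothetical.

```python
import math


def w2_diag_gaussians(mu_s, var_s, mu_t, var_t):
    """2-Wasserstein distance between N(mu_s, diag(var_s)) and N(mu_t, diag(var_t)).

    For diagonal covariances the closed form reduces to
    W_2^2 = ||mu_s - mu_t||^2 + ||sqrt(var_s) - sqrt(var_t)||^2.
    Assumption: the statistics are per-channel means and variances.
    """
    if not (len(mu_s) == len(var_s) == len(mu_t) == len(var_t)):
        raise ValueError("all inputs must have the same number of channels")
    # Squared Euclidean distance between the mean vectors.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))
    # Squared Euclidean distance between the standard deviation vectors.
    std_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_s, var_t))
    return math.sqrt(mean_term + std_term)


# Hypothetical source and target statistics for a two-channel layer.
source_mu, source_var = [0.0, 0.0], [1.0, 1.0]
target_mu, target_var = [3.0, 4.0], [1.0, 1.0]
print(w2_diag_gaussians(source_mu, source_var, target_mu, target_var))  # 5.0
```

With equal variances the distance reduces to the Euclidean distance between the means, which is why the example prints 5.0. A full-covariance version would need a matrix square root and is not shown.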