What Variables Affect Out-of-Distribution Generalization in Pretrained Models?