Towards Understanding the Role of Sharpness-Aware Minimization Algorithms for Out-of-Distribution Generalization