Data Augmentations for Improved (Large) Language Model Generalization