Assessing Robustness to Spurious Correlations in Post-Training Language Models