Assessing Robustness to Spurious Correlations in Post-Training Language Models
