Data Selection for Language Models via Importance Resampling