On the Connection between Pre-training Data Diversity and Fine-tuning Robustness