Efficient Data Selection for Training Genomic Perturbation Models