On the Importance of Building High-quality Training Datasets for Neural Code Search