Guided contrastive self-supervised pre-training for automatic speech recognition