Induction Network: Audio-Visual Modality Gap-Bridging for Self-Supervised Sound Source Localization