Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations