Sustainable self-supervised learning for speech representations