Self-supervised Semantic Segmentation Grounded in Visual Concepts