Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction