Two-Step Sound Source Separation: Training on Learned Latent Targets