Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction