MUSTAN: Multi-scale Temporal Context as Attention for Robust Video Foreground Segmentation