StableMask: Refining Causal Masking in Decoder-only Transformer