Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram