Improving Transformer-based Networks With Locality For Automatic Speaker Verification