Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media