Towards Interpretable Geo-localization: a Concept-Aware Global Image-GPS Alignment Framework

Jia, Furong, Liu, Lanxin, Hou, Ce, Zhang, Fan, Liu, Xinyan, Liu, Yu

Sep-8-2025–arXiv.org Artificial Intelligence

Worldwide geo-localization involves determining the exact geographic location of images captured globally, typically guided by geographic cues such as climate, landmarks, and architectural styles. Despite advances in geo-localization models such as GeoCLIP, which aligns image and location embeddings via contrastive learning for accurate predictions, the interpretability of these models remains insufficiently explored. Current concept-based interpretability methods fail to align effectively with the image-location embedding alignment objective, resulting in suboptimal interpretability and performance. To address this gap, we propose a novel framework integrating global geo-localization with concept bottlenecks. Our method inserts a Concept-Aware Alignment Module that jointly projects image and location embeddings onto a shared bank of geographic concepts (e.g., tropical climate, mountain, cathedral) and minimizes a concept-level loss, enhancing alignment in a concept-specific subspace and enabling robust interpretability. To our knowledge, this is the first work to introduce interpretability into geo-localization. Extensive experiments demonstrate that our approach surpasses GeoCLIP in geo-localization accuracy and boosts performance across diverse geospatial prediction tasks, revealing richer semantic insights into geographic decision-making processes.
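
The abstract describes the alignment module only at a high level, so the following is a minimal sketch of the general idea and not the authors' implementation. It assumes that "projecting onto a shared bank of concepts" means taking the cosine similarity between an embedding and each concept vector, and that the concept-level loss is an InfoNCE-style contrastive loss over those concept scores; both choices are assumptions. All function names, the toy concept vectors, and the temperature value are hypothetical, and only the Python standard library is used to keep the example dependency-free.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (falls back to a tiny norm for zero vectors)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1e-12
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def concept_scores(embedding, concept_bank):
    """Project one embedding onto every concept: one cosine similarity per concept."""
    e = l2_normalize(embedding)
    return [dot(e, l2_normalize(c)) for c in concept_bank]

def concept_alignment_loss(image_embs, location_embs, concept_bank, temperature=0.1):
    """InfoNCE-style loss in concept space (an assumed form of the concept-level loss).

    image_embs[i] and location_embs[i] are treated as a matched pair; every
    other location in the batch acts as a negative for image i.
    """
    img = [l2_normalize(concept_scores(e, concept_bank)) for e in image_embs]
    loc = [l2_normalize(concept_scores(e, concept_bank)) for e in location_embs]
    total = 0.0
    for i, u in enumerate(img):
        logits = [dot(u, v) / temperature for v in loc]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / len(img)

def top_concepts(embedding, concept_bank, names, k=2):
    """Return the k highest-scoring (concept name, score) pairs for one embedding."""
    scores = concept_scores(embedding, concept_bank)
    ranked = sorted(zip(names, scores), key=lambda p: p[1], reverse=True)
    return ranked[:k]

# Toy data: three concepts named in the abstract, with made-up 3-d vectors.
concept_names = ["tropical climate", "mountain", "cathedral"]
concept_bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
location_embs = [[0.9, 0.0, 0.1], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]

matched_loss = concept_alignment_loss(image_embs, location_embs, concept_bank)
# Rotating the locations by one position breaks every image-location pairing.
shuffled_loss = concept_alignment_loss(
    image_embs, location_embs[1:] + location_embs[:1], concept_bank
)
print(matched_loss < shuffled_loss)
print(top_concepts(image_embs[0], concept_bank, concept_names, k=1))
```

In this sketch the per-concept scores double as the explanation: ranking them with `top_concepts` shows which named concepts contribute most to a given image or location, which is the kind of interpretability a concept bottleneck is meant to provide.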

large language model, machine learning, natural language

Country:
- North America > United States (0.46)
- Asia > China (0.29)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (0.94)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language > Large Language Model (0.94)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)