Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization
Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li
Neural Information Processing Systems
Recent work shows that a deep network can be trained to determine the geographic location of a ground-level image (e.g., a Google street-view panorama) by matching it against a satellite map covering the wide geographic area of interest. Conventional deep networks, which often cast the problem as a metric embedding task, nevertheless suffer from low recall rates. One key reason is the vast difference between the two view modalities, i.e., ground view versus aerial/satellite view: they not only exhibit very different visual appearances, but also have distinctive geometric configurations. Existing deep methods overlook these appearance and geometric differences and instead rely on a brute-force training procedure, leading to inferior performance.