geospatial data
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
Feng, Jie, Wang, Shengyuan, Liu, Tianhui, Xi, Yanxin, Li, Yong
Urban research involves a wide range of scenarios and tasks that require the understanding of multi-modal data. Current methods often focus on specific data types and lack a unified framework in the urban field for processing them comprehensively. The recent success of multi-modal large language models (MLLMs) presents a promising opportunity to overcome this limitation. In this paper, we introduce $\textit{UrbanLLaVA}$, a multi-modal large language model designed to process these four types of data simultaneously and achieve strong performance across diverse urban tasks compared with general MLLMs. In $\textit{UrbanLLaVA}$, we first curate a diverse urban instruction dataset encompassing both single-modal and cross-modal urban data, spanning from a location view to a global view of the urban environment. Additionally, we propose a multi-stage training framework that decouples spatial reasoning enhancement from domain knowledge learning, thereby improving the compatibility and downstream performance of $\textit{UrbanLLaVA}$ across diverse urban tasks. Finally, we extend an existing benchmark for urban research to assess the performance of MLLMs across a wide range of urban tasks. Experimental results from three cities demonstrate that $\textit{UrbanLLaVA}$ outperforms open-source and proprietary MLLMs in both single-modal tasks and complex cross-modal tasks and shows robust generalization abilities across cities. Source code and data are openly accessible to the research community via https://github.com/tsinghua-fib-lab/UrbanLLaVA.
- Transportation > Ground > Road (0.68)
- Health & Medicine (0.67)
- Transportation > Infrastructure & Services (0.46)
- Education > Educational Setting (0.46)
From Bias to Accountability: How the EU AI Act Confronts Challenges in European GeoAI Auditing
Matuszczyk, Natalia, Barnes, Craig R., Gupta, Rohit, Ozel, Bulent, Mitra, Aniket
Bias in geospatial artificial intelligence (GeoAI) models has been documented, yet the evidence is scattered across narrowly focused studies. We synthesize this fragmented literature to provide a concise overview of bias in GeoAI and examine how the EU's Artificial Intelligence Act (EU AI Act) shapes audit obligations. We discuss recurring bias mechanisms, including representation, algorithmic and aggregation bias, and map them to specific provisions of the EU AI Act. By applying the Act's high-risk criteria, we demonstrate that widely deployed GeoAI applications qualify as high-risk systems. We then present examples of recent audits along with an outline of practical methods for detecting bias. As far as we know, this study represents the first integration of GeoAI bias evidence into the EU AI Act context, by identifying high-risk GeoAI systems and mapping bias mechanisms to the Act's Articles. Although the analysis is exploratory, it suggests that even well-curated European datasets should employ routine bias audits before 2027, when the AI Act's high-risk provisions take full effect.
- South America (0.04)
- North America > United States > Pennsylvania (0.04)
- Europe > Middle East (0.04)
- Research Report (1.00)
- Overview (0.93)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
Si, Haozhe, Wan, Yuxuan, Do, Minh, Vasisht, Deepak, Zhao, Han, Hamann, Hendrik F.
Geospatial raster data, such as that collected by satellite-based imaging systems at different times and spectral bands, hold immense potential for enabling a wide range of high-impact applications. This potential stems from the rich information that is spatially and temporally contextualized across multiple channels and sensing modalities. Recent work has adapted existing self-supervised learning approaches for such geospatial data. However, these approaches lack scalable model architectures, leading to inflexibility and computational inefficiencies when faced with an increasing number of channels and modalities. To address these limitations, we introduce the Low-rank Efficient Spatial-Spectral Vision Transformer (LESS ViT) with three key innovations: i) the LESS Attention Block that approximates high-dimensional spatial-spectral attention through the Kronecker product of the low-dimensional spatial and spectral attention components; ii) the Continuous Positional-Channel Embedding Layer that preserves both the continuity and physical characteristics of each spatial-spectral patch; and iii) the Perception Field Mask that exploits local spatial dependencies by constraining attention to neighboring patches. To evaluate the proposed innovations, we construct GFM-Bench, which serves as a comprehensive benchmark for such geospatial raster data. We pretrain LESS ViT using a Hyperspectral Masked Autoencoder framework with integrated positional and channel masking strategies. Experimental results demonstrate that our proposed method achieves competitive performance against state-of-the-art multi-modal geospatial foundation models while outperforming them on cross-satellite generalization tasks with higher computational efficiency. The flexibility and extensibility of our framework make it a promising direction for future geospatial data analysis tasks that involve a wide range of modalities and channels.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > Illinois > Champaign County > Champaign (0.04)
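The Kronecker-product idea named in the LESS ViT abstract above can be illustrated with a small numerical sketch. This is not the authors' implementation: the matrix sizes, random scores, and helper names (softmax_rows, kron, factored_attention) are assumptions chosen for illustration, and each token carries a single scalar value to keep the arithmetic visible. The sketch only shows why a Kronecker-structured attention matrix is cheap to apply: multiplying by kron(spatial, spectral) gives the same result as applying the small spatial and spectral matrices one after the other.

```python
import math
import random


def softmax_rows(scores):
    """Row-wise softmax of a list-of-lists matrix."""
    out = []
    for row in scores:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def kron(a, b):
    """Kronecker product: entry ((i, k), (j, l)) equals a[i][j] * b[k][l]."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


def factored_attention(spatial_attn, spectral_attn, values):
    """Apply kron(spatial, spectral) to an (patches x channels) grid
    without ever building the large matrix."""
    return matmul(matmul(spatial_attn, values), transpose(spectral_attn))


random.seed(0)
n_patches, n_channels = 3, 2  # hypothetical sizes

spatial_attn = softmax_rows(
    [[random.gauss(0, 1) for _ in range(n_patches)] for _ in range(n_patches)])
spectral_attn = softmax_rows(
    [[random.gauss(0, 1) for _ in range(n_channels)] for _ in range(n_channels)])
values = [[random.gauss(0, 1) for _ in range(n_channels)] for _ in range(n_patches)]

# Full route: build the (6 x 6) attention matrix and apply it to the flattened grid.
full_attn = kron(spatial_attn, spectral_attn)
flat_values = [v for row in values for v in row]
full_out = [sum(w * v for w, v in zip(row, flat_values)) for row in full_attn]

# Factored route: two small matrix products.
fact_out = [v for row in factored_attention(spatial_attn, spectral_attn, values)
            for v in row]

max_diff = max(abs(x - y) for x, y in zip(full_out, fact_out))
```

With N patches and C channels, the full matrix has (N·C)² entries, whereas the factored form stores only N² + C² weights, which is the kind of saving the abstract attributes to the low-rank design.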
GeoJEPA: Towards Eliminating Augmentation- and Sampling Bias in Multimodal Geospatial Learning
Lundqvist, Theodor, Delvret, Ludvig
Existing methods for self-supervised representation learning of geospatial regions and map entities rely extensively on the design of pretext tasks, often involving augmentations or heuristic sampling of positive and negative pairs based on spatial proximity. This reliance introduces biases and limits the representations' expressiveness and generalisability. Consequently, the literature has expressed a pressing need to explore different methods for modelling geospatial data. To address the key difficulties of such methods, namely multimodality, heterogeneity, and the choice of pretext tasks, we present GeoJEPA, a versatile multimodal fusion model for geospatial data built on the self-supervised Joint-Embedding Predictive Architecture. With GeoJEPA, we aim to eliminate the widely accepted augmentation- and sampling biases found in self-supervised geospatial representation learning. GeoJEPA uses self-supervised pretraining on a large dataset of OpenStreetMap attributes, geometries and aerial images. The results are multimodal semantic representations of urban regions and map entities that we evaluate both quantitatively and qualitatively. Through this work, we uncover several key insights into JEPA's ability to handle multimodal data.
- Asia > China > Beijing > Beijing (0.04)
- Europe > Switzerland (0.04)
- North America > United States > Virginia (0.04)
- Research Report (1.00)
- Overview (0.67)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
MapQaTor: A System for Efficient Annotation of Map Query Datasets
Dihan, Mahir Labib, Ali, Mohammed Eunus, Parvez, Md Rizwan
Mapping and navigation services such as Google Maps, Apple Maps, and OpenStreetMap are essential for accessing various location-based data, yet they often struggle to handle natural language geospatial queries. Recent advancements in Large Language Models (LLMs) show promise in question answering (QA), but creating reliable geospatial QA datasets from map services remains challenging. We introduce MapQaTor, a web application that streamlines the creation of reproducible, traceable map-based QA datasets. With its plug-and-play architecture, MapQaTor enables seamless integration with any maps API, allowing users to gather and visualize data from diverse sources with minimal setup. By caching API responses, the platform ensures consistent ground truth, enhancing the reliability of the data even as real-world information evolves. MapQaTor centralizes data retrieval, annotation, and visualization within a single platform, offering a unique opportunity to evaluate the current state of LLM-based geospatial reasoning while advancing their capabilities for improved geospatial understanding. Evaluation metrics show that MapQaTor speeds up the annotation process by at least 30 times compared to manual methods, underscoring its potential for developing geospatial resources, such as complex map reasoning datasets. The website is live at: https://mapqator.github.io/ and a demo video is available at: https://youtu.be/7_aV9Wmhs6Q.
- Asia > South Korea (0.04)
- Asia > Middle East > Qatar (0.04)
- Asia > Bangladesh (0.04)
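The caching behaviour described in the MapQaTor abstract above can be sketched as follows. This is an illustrative assumption about how such a cache might work, not MapQaTor's actual code: ResponseCache, make_changing_api, the "text_search" endpoint name, and the response fields are all hypothetical. The point is that the first response for a request is frozen, so annotations built on it stay reproducible even if the live service later returns something different.

```python
import hashlib
import json


class ResponseCache:
    """Stores the first response for each (endpoint, params) request
    and replays it on every later identical request."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(endpoint, params) -> JSON-serialisable data
        self._store = {}
        self.misses = 0

    @staticmethod
    def _key(endpoint, params):
        # sort_keys makes the key independent of parameter ordering.
        payload = json.dumps({"endpoint": endpoint, "params": params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, endpoint, params):
        key = self._key(endpoint, params)
        if key not in self._store:
            self.misses += 1
            # Store a serialised copy so callers cannot mutate the cached value.
            self._store[key] = json.dumps(self._fetch(endpoint, params))
        return json.loads(self._store[key])


def make_changing_api():
    """Hypothetical stand-in for a live maps API whose answers drift over time."""
    state = {"calls": 0}

    def fetch(endpoint, params):
        state["calls"] += 1
        return {"place": params["query"], "rating": 4.0 + 0.1 * state["calls"]}

    return fetch


cache = ResponseCache(make_changing_api())
first = cache.get("text_search", {"query": "museum", "radius_m": 500})
second = cache.get("text_search", {"radius_m": 500, "query": "museum"})
```

Here the second call differs only in parameter order, so it is served from the cache and returns the same data as the first, while the underlying service is contacted once.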
Sims: An Interactive Tool for Geospatial Matching and Clustering
Zaytar, Akram, Tadesse, Girmaw Abebe, Robinson, Caleb, Bendito, Eduardo G., Devare, Medha, Chernet, Meklit, Hacheme, Gilles Q., Dodhia, Rahul, Ferres, Juan M. Lavista
Acquiring, processing, and visualizing geospatial data requires significant computing resources, especially for large spatio-temporal domains. This challenge hinders the rapid discovery of predictive features, which is essential for advancing geospatial modeling. To address this, we developed Similarity Search (Sims), a no-code web tool that allows users to perform clustering and similarity search over defined regions of interest using Google Earth Engine as a backend. Sims is designed to complement existing modeling tools by focusing on feature exploration rather than model creation. We demonstrate the utility of Sims through a case study analyzing simulated maize yield data in Rwanda, where we evaluate how different combinations of soil, weather, and agronomic features affect the clustering of yield response zones. Sims is open source and available at https://github.com/microsoft/Sims
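As a rough illustration of the similarity-search idea in the Sims abstract above, the sketch below ranks regions by how close their feature vectors are to a query region. It is not taken from the Sims code base and makes no Google Earth Engine calls: the region ids, the three features (rainfall in mm, soil pH, elevation in m), their values, and the function names are all hypothetical, and Euclidean distance on z-scored features is just one reasonable choice of similarity measure.

```python
import math


def standardize(rows):
    """Z-score each feature column so features on different scales are comparable."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def rank_by_similarity(features, query_id):
    """Return the other region ids, nearest first, by Euclidean distance
    to the query region in standardized feature space."""
    ids = list(features)
    scaled = dict(zip(ids, standardize([features[i] for i in ids])))
    query = scaled[query_id]
    dist = {i: math.dist(query, scaled[i]) for i in ids if i != query_id}
    return sorted(dist, key=dist.get)


# Hypothetical regions: (rainfall_mm, soil_ph, elevation_m)
regions = {
    "A": (1200.0, 5.5, 1500.0),
    "B": (1180.0, 5.6, 1520.0),
    "C": (800.0, 6.8, 900.0),
    "D": (1000.0, 6.0, 1300.0),
}

ranking = rank_by_similarity(regions, "A")
```

Standardizing first matters because rainfall and elevation are in the hundreds or thousands while pH spans a few units; without it, the large-valued features would dominate the distance.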
Interview with Andrews Ata Kangah: Localising illegal mining sites using machine learning and geospatial data
Andrews Ata Kangah is a team leader and researcher working on democratizing AI and AI solutions for environmental problems. We spoke to him about his research, attending the AfriClimate AI workshop at the Deep Learning Indaba, and what inspired him to work in AI and on climate-related projects. My name is Andrews Ata Kangah. I also double as a researcher at Armtos, which is a non-profit. At Armtos, our current goal is to build a solution to solve the illegal mining problem that's going on in Ghana. The mining is destroying the lands that are within mining areas.
- Africa > Ghana (0.29)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
- North America > Canada (0.05)
- Africa > Senegal (0.05)
Geospatial foundation models for image analysis: evaluating and enhancing NASA-IBM Prithvi's domain adaptability
Hsu, Chia-Yu, Li, Wenwen, Wang, Sizhe
Research on geospatial foundation models (GFMs) has become a trending topic in geospatial artificial intelligence (AI) research due to their potential for achieving high generalizability and domain adaptability, reducing model training costs for individual researchers. Unlike large language models, such as ChatGPT, constructing visual foundation models for image analysis, particularly in remote sensing, has encountered significant challenges such as formulating diverse vision tasks into a general problem framework. This paper evaluates the recently released NASA-IBM GFM Prithvi for its predictive performance on high-level image analysis tasks across multiple benchmark datasets. Prithvi was selected because it is one of the first open-source GFMs trained on time series of high-resolution remote sensing imagery. A series of experiments was designed to assess Prithvi's performance compared to other pre-trained task-specific AI models in geospatial image analysis. New strategies, including band adaptation, multi-scale feature generation, and fine-tuning techniques, are introduced and integrated into an image analysis pipeline to enhance Prithvi's domain adaptation capability and improve model performance. In-depth analyses reveal Prithvi's strengths and weaknesses, offering insights for both improving Prithvi and developing future visual foundation models for geospatial tasks.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- Europe > Denmark (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.88)
An Autonomous GIS Agent Framework for Geospatial Data Retrieval
Ning, Huan, Li, Zhenlong, Akinboyewa, Temitope, Lessani, M. Naser
Powered by the emerging large language models (LLMs), autonomous geographic information systems (GIS) agents have the potential to accomplish spatial analyses and cartographic tasks. However, a research gap remains in supporting fully autonomous GIS agents: how to enable agents to discover and download the necessary data for geospatial analyses. This study proposes an autonomous GIS agent framework capable of retrieving required geospatial data by generating, executing, and debugging programs. The framework uses the LLM as the decision-maker, which selects the appropriate data source(s) from a pre-defined source list and fetches the data from the chosen source. Each data source has a handbook that records the metadata and technical details for data retrieval. The proposed framework is designed in a plug-and-play style to ensure flexibility and extensibility. Human users or autonomous data crawlers can add new data sources by adding new handbooks. We developed a prototype agent based on the framework, released as a QGIS plugin (GeoData Retrieve Agent) and a Python program. Experiment results demonstrate its capability of retrieving data from various sources, including OpenStreetMap, administrative boundaries and demographic data from the US Census Bureau, satellite basemaps from ESRI World Imagery, and a global digital elevation model (DEM) from OpenTopography.org. Our study is among the first attempts to develop an autonomous geospatial data retrieval agent. Keywords: autonomous GIS; geospatial data retrieval; large language models; generative AI; GIS agent; AI assistant
- North America > United States > Pennsylvania (0.05)
- North America > United States > Idaho > Boundary County (0.05)
- North America > Puerto Rico (0.05)
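The handbook-based, plug-and-play design described in the GIS agent abstract above might look roughly like the sketch below. It is an assumption-laden illustration, not the published framework: Handbook, SourceRegistry, the keyword lists, and the stub fetch functions are hypothetical, no real data service is contacted, and a simple keyword-overlap score stands in for the LLM that makes the selection in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Handbook:
    """Metadata and retrieval details for one data source (hypothetical fields)."""
    name: str
    description: str
    keywords: List[str]
    fetch: Callable[[str], str]


class SourceRegistry:
    """Pre-defined source list; new sources are added by registering a handbook."""

    def __init__(self) -> None:
        self._handbooks: Dict[str, Handbook] = {}

    def register(self, handbook: Handbook) -> None:
        self._handbooks[handbook.name] = handbook

    def select(self, request: str) -> Handbook:
        # Stand-in for the LLM decision step: count keyword hits per handbook.
        words = set(request.lower().split())
        scored = [(len(words & set(h.keywords)), h.name)
                  for h in self._handbooks.values()]
        if not scored:
            raise LookupError("no data sources registered")
        best_score, best_name = max(scored)
        if best_score == 0:
            raise LookupError(f"no handbook matches request: {request!r}")
        return self._handbooks[best_name]

    def retrieve(self, request: str) -> str:
        return self.select(request).fetch(request)


registry = SourceRegistry()
registry.register(Handbook(
    name="OpenStreetMap",
    description="Roads, buildings, and other map features",
    keywords=["roads", "buildings", "osm"],
    fetch=lambda req: f"OpenStreetMap stub result for: {req}",
))
registry.register(Handbook(
    name="US Census Bureau",
    description="Administrative boundaries and demographic data",
    keywords=["population", "demographic", "census"],
    fetch=lambda req: f"US Census Bureau stub result for: {req}",
))

# Adding a source later only requires one more handbook; no other code changes.
registry.register(Handbook(
    name="OpenTopography",
    description="Global digital elevation model (DEM)",
    keywords=["elevation", "dem"],
    fetch=lambda req: f"OpenTopography stub result for: {req}",
))

result = registry.retrieve("elevation dem for a county")
```

Keeping the retrieval details inside each handbook is what makes the design extensible: the selection and dispatch logic never needs to know how a particular source is queried.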
Quantifying Geospatial in the Common Crawl Corpus
Ilyankou, Ilya, Wang, Meihui, Haworth, James, Cavazzi, Stefano
Large language models (LLMs) exhibit emerging geospatial capabilities, stemming from their pre-training on vast unlabelled text datasets that are often derived from the Common Crawl (CC) corpus. However, the geospatial content within CC remains largely unexplored, limiting our understanding of LLMs' spatial reasoning. This paper investigates the prevalence of geospatial data in recent Common Crawl releases using Gemini, a powerful language model. By analyzing a sample of documents and manually revising the results, we estimate that between 1 in 5 and 1 in 6 documents contain geospatial information such as coordinates and street addresses. Our findings provide quantitative insights into the nature and extent of geospatial data within Common Crawl, and web crawl data in general. Furthermore, we formulate questions to guide future investigations into the geospatial content of available web crawl datasets and its influence on LLMs.
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- South America > Uruguay > Montevideo > Montevideo (0.04)
- Research Report (1.00)
- Overview (0.87)
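For scale, the "between 1 in 5 and 1 in 6" estimate in the Common Crawl abstract above corresponds to roughly 20% down to about 16.7% of documents. The sketch below shows how such a share could be computed over a sample. It deliberately swaps in a much cruder technique than the paper's: the study used Gemini with manual revision, whereas this uses a regular expression that only finds decimal-degree coordinate pairs and ignores street addresses. The sample documents are invented for illustration and are not data from the study.

```python
import re

# Two decimal numbers with at least three decimals, separated by a comma.
# The lookbehind stops the pattern from matching inside a longer number.
COORD_PAIR = re.compile(r"(?<![\d.])(-?\d{1,2}\.\d{3,})\s*,\s*(-?\d{1,3}\.\d{3,})")


def has_coordinates(text):
    """True if the text contains a plausible latitude, longitude pair."""
    for match in COORD_PAIR.finditer(text):
        lat, lon = float(match.group(1)), float(match.group(2))
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            return True
    return False


def geospatial_share(documents):
    """Fraction of documents flagged by the coordinate detector."""
    hits = sum(has_coordinates(doc) for doc in documents)
    return hits / len(documents)


# Hypothetical sample of six short documents.
sample = [
    "Our office is at 51.5246, -0.1340 near the station.",
    "Version 2.1.0 released, 3.2.1 planned.",
    "Prices rose 12.5%, then fell.",
    "Recipe: mix flour and water.",
    "Meeting at 10.30, room 4.",
    "Score was 99.123, 500.456 overall.",  # out of range, so not a coordinate
]

share = geospatial_share(sample)
```

The range check on latitude and longitude is what rejects the last document, which shows why pattern matching alone tends to overcount and why the study's manual revision step is useful.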