Geophysical Analysis & Survey
Deep Learning for Pavement Condition Evaluation Using Satellite Imagery
Lebaku, Prathyush Kumar Reddy, Gao, Lu, Lu, Pan, Sun, Jingran
Civil infrastructure systems covers large land areas and needs frequent inspections to maintain their public service capabilities. The conventional approaches of manual surveys or vehicle-based automated surveys to assess infrastructure conditions are often labor-intensive and time-consuming. For this reason, it is worthwhile to explore more cost-effective methods for monitoring and maintaining these infrastructures. Fortunately, recent advancements in satellite systems and image processing algorithms have opened up new possibilities. Numerous satellite systems have been employed to monitor infrastructure conditions and identify damages. Due to the improvement in ground sample distance (GSD), the level of detail that can be captured has significantly increased. Taking advantage of these technology advancement, this research investigated to evaluate pavement conditions using deep learning models for analyzing satellite images. We gathered over 3,000 satellite images of pavement sections, together with pavement evaluation ratings from TxDOT's PMIS database. The results of our study show an accuracy rate is exceeding 90%. This research paves the way for a rapid and cost-effective approach to evaluating the pavement network in the future.
Deploying Geospatial Foundation Models in the Real World: Lessons from WorldCereal
Butsko, Christina, Van Tricht, Kristof, Tseng, Gabriel, Milli, Giorgia, Rolnick, David, Cartuyvels, Ruben, Reshef, Inbal Becker, Szantoi, Zoltan, Kerner, Hannah
The increasing availability of geospatial foundation models has the potential to transform remote sensing applications such as land cover classification, environmental monitoring, and change detection. Despite promising benchmark results, the deployment of these models in operational settings is challenging and rare. Standardized evaluation tasks often fail to capture real-world complexities relevant for end-user adoption such as data heterogeneity, resource constraints, and application-specific requirements. This paper presents a structured approach to integrate geospatial foundation models into operational mapping systems. Our protocol has three key steps: defining application requirements, adapting the model to domain-specific data and conducting rigorous empirical testing. Using the Presto model in a case study for crop mapping, we demonstrate that fine-tuning a pre-trained model significantly improves performance over conventional supervised methods. Our results highlight the model's strong spatial and temporal generalization capabilities. Our protocol provides a replicable blueprint for practitioners and lays the groundwork for future research to operationalize foundation models in diverse remote sensing applications. Application of the protocol to the WorldCereal global crop-mapping system showcases the framework's scalability.
Referring Remote Sensing Image Segmentation with Cross-view Semantics Interaction Network
Yang, Jiaxing, Zhang, Lihe, Lu, Huchuan
Recently, Referring Remote Sensing Image Segmentation (RRSIS) has aroused wide attention. To handle drastic scale variation of remote targets, existing methods only use the full image as input and nest the saliency-preferring techniques of cross-scale information interaction into traditional single-view structure. Although effective for visually salient targets, they still struggle in handling tiny, ambiguous ones in lots of real scenarios. In this work, we instead propose a paralleled yet unified segmentation framework Cross-view Semantics Interaction Network (CSINet) to solve the limitations. Motivated by human behavior in observing targets of interest, the network orchestrates visual cues from remote and close distances to conduct synergistic prediction. In its every encoding stage, a Cross-View Window-attention module (CVWin) is utilized to supplement global and local semantics into close-view and remote-view branch features, finally promoting the unified representation of feature in every encoding stage. In addition, we develop a Collaboratively Dilated Attention enhanced Decoder (CDAD) to mine the orientation property of target and meanwhile integrate cross-view multiscale features. The proposed network seamlessly enhances the exploitation of global and local semantics, achieving significant improvements over others while maintaining satisfactory speed.
GraphVSSM: Graph Variational State-Space Model for Probabilistic Spatiotemporal Inference of Dynamic Exposure and Vulnerability for Regional Disaster Resilience Assessment
Dimasaka, Joshua, Geiร, Christian, So, Emily
Regional disaster resilience quantifies the changing nature of physical risks to inform policy instruments ranging from local immediate recovery to international sustainable development. While many existing state-of-practice methods have greatly advanced the dynamic mapping of exposure and hazard, our understanding of large-scale physical vulnerability has remained static, costly, limited, region-specific, coarse-grained, overly aggregated, and inadequately calibrated. With the significant growth in the availability of time-series satellite imagery and derived products for exposure and hazard, we focus our work on the equally important yet challenging element of the risk equation: physical vulnerability. We leverage machine learning methods that flexibly capture spatial contextual relationships, limited temporal observations, and uncertainty in a unified probabilistic spatiotemporal inference framework. We therefore introduce Graph Variational State-Space Model (GraphVSSM), a novel modular spatiotemporal approach that uniquely integrates graph deep learning, state-space modeling, and variational inference using time-series data and prior expert belief systems in a weakly supervised or coarse-to-fine-grained manner. We present three major results: a city-wide demonstration in Quezon City, Philippines; an investigation of sudden changes in the cyclone-impacted coastal Khurushkul community (Bangladesh) and mudslide-affected Freetown (Sierra Leone); and an open geospatial dataset, METEOR 2.5D, that spatiotemporally enhances the existing global static dataset for UN Least Developed Countries (2020). Beyond advancing regional disaster resilience assessment and improving our understanding global disaster risk reduction progress, our method also offers a probabilistic deep learning approach, contributing to broader urban studies that require compositional data analysis in weak supervision.
SU-ESRGAN: Semantic and Uncertainty-Aware ESRGAN for Super-Resolution of Satellite and Drone Imagery with Fine-Tuning for Cross Domain Evaluation
Generative Adversarial Networks (GANs) have achieved realistic super-resolution (SR) of images however, they lack semantic consistency and per-pixel confidence, limiting their credibility in critical remote sensing applications such as disaster response, urban planning and agriculture. This paper introduces Semantic and Uncertainty-Aware ESRGAN (SU-ESRGAN), the first SR framework designed for satellite imagery to integrate the ESRGAN, segmentation loss via DeepLabv3 for class detail preservation and Monte Carlo dropout to produce pixel-wise uncertainty maps. The SU-ESRGAN produces results (PSNR, SSIM, LPIPS) comparable to the Baseline ESRGAN on aerial imagery. This novel model is valuable in satellite systems or UAVs that use wide field-of-view (FoV) cameras, trading off spatial resolution for coverage. The modular design allows integration in UAV data pipelines for on-board or post-processing SR to enhance imagery resulting due to motion blur, compression and sensor limitations. Further, the model is fine-tuned to evaluate its performance on cross domain applications. The tests are conducted on two drone based datasets which differ in altitude and imaging perspective. Performance evaluation of the fine-tuned models show a stronger adaptation to the Aerial Maritime Drone Dataset, whose imaging characteristics align with the training data, highlighting the importance of domain-aware training in SR-applications.
IAMAP: Unlocking Deep Learning in QGIS for non-coders and limited computing resources
Tresson, Paul, Coz, Pierre Le, Tulet, Hadrien, Malkassian, Anthony, Mรฉchain, Maxime Rรฉjou
Remote sensing has entered a new era with the rapid development of artificial intelligence approaches. However, the implementation of deep learning has largely remained restricted to specialists and has been impractical because it often requires (i) large reference datasets for model training and validation; (ii) substantial computing resources; and (iii) strong coding skills. Here, we introduce IAMAP, a user-friendly QGIS plugin that addresses these three challenges in an easy yet flexible way. IAMAP builds on recent advancements in self-supervised learning strategies, which now provide robust feature extractors, often referred to as foundation models. These generalist models can often be reliably used in few-shot or zero-shot scenarios (i.e., with little to no fine-tuning). IAMAP's interface allows users to streamline several key steps in remote sensing image analysis: (i) extracting image features using a wide range of deep learning architectures; (ii) reducing dimensionality with built-in algorithms; (iii) performing clustering on features or their reduced representations; (iv) generating feature similarity maps; and (v) calibrating and validating supervised machine learning models for prediction. By enabling non-AI specialists to leverage the high-quality features provided by recent deep learning approaches without requiring GPU capacity or extensive reference datasets, IAMAP contributes to the democratization of computationally efficient and energy-conscious deep learning methods.
Tile and Slide : A New Framework for Scaling NeRF from Local to Global 3D Earth Observation
Billouard, Camille, Derksen, Dawa, Constantin, Alexandre, Vallet, Bruno
Neural Radiance Fields (NeRF) have recently emerged as a paradigm for 3D reconstruction from multiview satellite imagery. However, state-of-the-art NeRF methods are typically constrained to small scenes due to the memory footprint during training, which we study in this paper. Previous work on large-scale NeRFs palliate this by dividing the scene into NeRFs. This paper introduces Snake-NeRF, a framework that scales to large scenes. Our out-of-core method eliminates the need to load all images and networks simultaneously, and operates on a single device. We achieve this by dividing the region of interest into NeRFs that 3D tile without overlap. Importantly, we crop the images with overlap to ensure each NeRFs is trained with all the necessary pixels. We introduce a novel $2\times 2$ 3D tile progression strategy and segmented sampler, which together prevent 3D reconstruction errors along the tile edges. Our experiments conclude that large satellite images can effectively be processed with linear time complexity, on a single GPU, and without compromise in quality.
Satellite Federated Fine-Tuning for Foundation Models in Space Computing Power Networks
Zhu, Yan, Zhu, Jingyang, Wang, Ting, Shi, Yuanming, Jiang, Chunxiao, Letaief, Khaled Ben
Advancements in artificial intelligence (AI) and low-earth orbit (LEO) satellites have promoted the application of large remote sensing foundation models for various downstream tasks. However, direct downloading of these models for fine-tuning on the ground is impeded by privacy concerns and limited bandwidth. Satellite federated learning (FL) offers a solution by enabling model fine-tuning directly on-board satellites and aggregating model updates without data downloading. Nevertheless, for large foundation models, the computational capacity of satellites is insufficient to support effective on-board fine-tuning in traditional satellite FL frameworks. To address these challenges, we propose a satellite-ground collaborative federated fine-tuning framework. The key of the framework lies in how to reasonably decompose and allocate model components to alleviate insufficient on-board computation capabilities. During fine-tuning, satellites exchange intermediate results with ground stations or other satellites for forward propagation and back propagation, which brings communication challenges due to the special communication topology of space transmission networks, such as intermittent satellite-ground communication, short duration of satellite-ground communication windows, and unstable inter-orbit inter-satellite links (ISLs). To reduce transmission delays, we further introduce tailored communication strategies that integrate both communication and computing resources. Specifically, we propose a parallel intra-orbit communication strategy, a topology-aware satellite-ground communication strategy, and a latency-minimalization inter-orbit communication strategy to reduce space communication costs. Simulation results demonstrate significant reductions in training time with improvements of approximately 33%.
DeepC4: Deep Conditional Census-Constrained Clustering for Large-scale Multitask Spatial Disaggregation of Urban Morphology
Dimasaka, Joshua, Geiร, Christian, So, Emily
To understand our global progress for sustainable development and disaster risk reduction in many developing economies, two recent major initiatives - the Uniform African Exposure Dataset of the Global Earthquake Model (GEM) Foundation and the Modelling Exposure through Earth Observation Routines (METEOR) Project - implemented classical spatial disaggregation techniques to generate large-scale mapping of urban morphology using the information from various satellite imagery and its derivatives, geospatial datasets of the built environment, and subnational census statistics. However, the local discrepancy with well-validated census statistics and the propagated model uncertainties remain a challenge in such coarse-to-fine-grained mapping problems, specifically constrained by weak and conditional label supervision. Therefore, we present Deep Conditional Census-Constrained Clustering (DeepC4), a novel deep learning-based spatial disaggregation approach that incorporates local census statistics as cluster-level constraints while considering multiple conditional label relationships in a joint multitask learning of the patterns of satellite imagery. To demonstrate, compared to GEM and METEOR, we enhanced the quality of Rwandan maps of urban morphology, specifically building exposure and physical vulnerability, at the third-level administrative unit from the 2022 census. As the world approaches the conclusion of our global frameworks in 2030, our work has offered a new deep learning-based mapping technique towards a spatial auditing of our existing coarse-grained derived information at large scales.
AI-ming backwards: Vanishing archaeological landscapes in Mesopotamia and automatic detection of sites on CORONA imagery
Pistola, Alessandro, Orru', Valentina, Marchetti, Nicolo', Roccetti, Marco
By upgrading an existing deep learning model with the knowledge provided by one of the oldest sets of grayscale satellite imagery, known as CORONA, we improved the AI model's attitude towards the automatic identification of archaeological sites in an envir onment which has been completely transformed in the last five decades, including the complete destruction of many of those same sites. The initial Bing - based convolutional network model was re - trained using CORONA satellite imagery for the district of Abu Ghraib, west of Baghdad, central Mesopotamian floodplain. The results were twofold and surprising. First, the detection precision obtained on the area of interest increased sensibly: in particular, the Intersection - over - Union (IoU) values, at the image segmentation level, surpassed 85%, while the general accuracy in detecting archeological sites reached 90%. Second, our re - trained model allowed the identification of four new sites of archaeological interest (confirmed through field verification), previously not identified by archaeologists with traditional techniques. This has confirmed the efficacy of using AI techniques and the CORONA imagery from the 1960s to discover archaeological sites currently no longer visible, a concrete breakthrough with significant consequences for the study of landscapes with vanishing archaeological evidence induced by anthropization.