Goto

Collaborating Authors

 Geophysical Analysis & Survey


Deep Semantic-Visual Alignment for Zero-Shot Remote Sensing Image Scene Classification

arXiv.org Artificial Intelligence

Deep neural networks have achieved promising progress in remote sensing (RS) image classification, for which the training process requires abundant samples for each class. However, it is time-consuming and unrealistic to annotate labels for each RS category, given the fact that the RS target database is increasing dynamically. Zero-shot learning (ZSL) allows for identifying novel classes that are not seen during training, which provides a promising solution for the aforementioned problem. However, previous ZSL models mainly depend on manually-labeled attributes or word embeddings extracted from language models to transfer knowledge from seen classes to novel classes. Besides, pioneer ZSL models use convolutional neural networks pre-trained on ImageNet, which focus on the main objects appearing in each image, neglecting the background context that also matters in RS scene classification. To address the above problems, we propose to collect visually detectable attributes automatically. We predict attributes for each class by depicting the semantic-visual similarity between attributes and images. In this way, the attribute annotation process is accomplished by machine instead of human as in other methods. Moreover, we propose a Deep Semantic-Visual Alignment (DSVA) that take advantage of the self-attention mechanism in the transformer to associate local image regions together, integrating the background context information for prediction. The DSVA model further utilizes the attribute attention maps to focus on the informative image regions that are essential for knowledge transfer in ZSL, and maps the visual images into attribute space to perform ZSL classification. With extensive experiments, we show that our model outperforms other state-of-the-art models by a large margin on a challenging large-scale RS scene classification benchmark.


Climate Trends of Tropical Cyclone Intensity and Energy Extremes Revealed by Deep Learning

arXiv.org Artificial Intelligence

Anthropogenic influences have been linked to tropical cyclone (TC) poleward migration, TC extreme precipitation, and an increased proportion of major hurricanes [1, 2, 3, 4]. Understanding past TC trends and variability is critical for projecting future TC impacts on human society considering the changing climate [5]. However, past trends of TC structure/energy remain uncertain due to limited observations; subjective-analyzed and spatiotemporal-heterogeneous "best-track" datasets lead to reduced confidence in the assessed TC repose to climate change [6, 7]. Here, we use deep learning to reconstruct past "observations" and yield an objective global TC wind profile dataset during 1981 to 2020, facilitating a comprehensive examination of TC structure/energy. By training with uniquely labeled data integrating best tracks and numerical model analysis of 2004 to 2018 TCs, our model converts multichannel satellite imagery to a 0-750-km wind profile of axisymmetric surface winds. The model performance is verified to be sufficient for climate studies by comparing it to independent satellite-radar surface winds. Based on the new homogenized dataset, the major TC proportion has increased by ~13% in the past four decades. Moreover, the proportion of extremely high-energy TCs has increased by ~25%, along with an increasing trend (> one standard deviation of the 40-y variability) of the mean total energy of high-energy TCs. Although the warming ocean favors TC intensification, the TC track migration to higher latitudes and altered environments further affect TC structure/energy. This new deep learning method/dataset reveals novel trends regarding TC structure extremes and may help verify simulations/studies regarding TCs in the changing climate.


SugarViT -- Multi-objective Regression of UAV Images with Vision Transformers and Deep Label Distribution Learning Demonstrated on Disease Severity Prediction in Sugar Beet

arXiv.org Artificial Intelligence

Remote sensing and artificial intelligence are pivotal technologies of precision agriculture nowadays. The efficient retrieval of large-scale field imagery combined with machine learning techniques shows success in various tasks like phenotyping, weeding, cropping, and disease control. This work will introduce a machine learning framework for automatized large-scale plant-specific trait annotation for the use case disease severity scoring for Cercospora Leaf Spot (CLS) in sugar beet. With concepts of Deep Label Distribution Learning (DLDL), special loss functions, and a tailored model architecture, we develop an efficient Vision Transformer based model for disease severity scoring called SugarViT. One novelty in this work is the combination of remote sensing data with environmental parameters of the experimental sites for disease severity prediction. Although the model is evaluated on this special use case, it is held as generic as possible to also be applicable to various image-based classification and regression tasks. With our framework, it is even possible to learn models on multi-objective problems as we show by a pretraining on environmental metadata.


Combining Deep Learning and Street View Imagery to Map Smallholder Crop Types

arXiv.org Artificial Intelligence

Accurate crop type maps are an essential source of information for monitoring yield progress at scale, projecting global crop production, and planning effective policies. To date, however, crop type maps remain challenging to create in low and middle-income countries due to a lack of ground truth labels for training machine learning models. Field surveys are the gold standard in terms of accuracy but require an often-prohibitively large amount of time, money, and statistical capacity. In recent years, street-level imagery, such as Google Street View, KartaView, and Mapillary, has become available around the world. Such imagery contains rich information about crop types grown at particular locations and times. In this work, we develop an automated system to generate crop type ground references using deep learning and Google Street View imagery. The method efficiently curates a set of street view images containing crop fields, trains a model to predict crop type by utilizing weakly-labelled images from disparate out-of-domain sources, and combines predicted labels with remote sensing time series to create a wall-to-wall crop type map. We show that, in Thailand, the resulting country-wide map of rice, cassava, maize, and sugarcane achieves an accuracy of 93%. We publicly release the first-ever crop type map for all of Thailand 2022 at 10m-resolution with no gaps. To our knowledge, this is the first time a 10m-resolution, multi-crop map has been created for any smallholder country. As the availability of roadside imagery expands, our pipeline provides a way to map crop types at scale around the globe, especially in underserved smallholder regions.


GeoSAM: Fine-tuning SAM with Sparse and Dense Visual Prompting for Automated Segmentation of Mobility Infrastructure

arXiv.org Artificial Intelligence

The Segment Anything Model (SAM) has shown impressive performance when applied to natural image segmentation. However, it struggles with geographical images like aerial and satellite imagery, especially when segmenting mobility infrastructure including roads, sidewalks, and crosswalks. This inferior performance stems from the narrow features of these objects, their textures blending into the surroundings, and interference from objects like trees, buildings, vehicles, and pedestrians - all of which can disorient the model to produce inaccurate segmentation maps. To address these challenges, we propose Geographical SAM (GeoSAM), a novel SAM-based framework that implements a fine-tuning strategy using the dense visual prompt from zero-shot learning, and the sparse visual prompt from a pre-trained CNN segmentation model. The proposed GeoSAM outperforms existing approaches for geographical image segmentation, specifically by 26%, 7%, and 17% for road infrastructure, pedestrian infrastructure, and on average, respectively, representing a momentous leap in leveraging foundation models to segment mobility infrastructure including both road and pedestrian infrastructure in geographical images. The source code can be found on this GitHub repository: https://github.com/rafiibnsultan/GeoSAM/tree/main.


Bridging the Domain Gap: A Simple Domain Matching Method for Reference-based Image Super-Resolution in Remote Sensing

arXiv.org Artificial Intelligence

Recently, reference-based image super-resolution (RefSR) has shown excellent performance in image super-resolution (SR) tasks. The main idea of RefSR is to utilize additional information from the reference (Ref) image to recover the high-frequency components in low-resolution (LR) images. By transferring relevant textures through feature matching, RefSR models outperform existing single image super-resolution (SISR) models. However, their performance significantly declines when a domain gap between Ref and LR images exists, which often occurs in real-world scenarios, such as satellite imaging. In this letter, we introduce a Domain Matching (DM) module that can be seamlessly integrated with existing RefSR models to enhance their performance in a plug-and-play manner. To the best of our knowledge, we are the first to explore Domain Matching-based RefSR in remote sensing image processing. Our analysis reveals that their domain gaps often occur in different satellites, and our model effectively addresses these challenges, whereas existing models struggle. Our experiments demonstrate that the proposed DM module improves SR performance both qualitatively and quantitatively for remote sensing super-resolution tasks.


Stitching Satellites to the Edge: Pervasive and Efficient Federated LEO Satellite Learning

arXiv.org Artificial Intelligence

In the ambitious realm of space AI, the integration of federated learning (FL) with low Earth orbit (LEO) satellite constellations holds immense promise. However, many challenges persist in terms of feasibility, learning efficiency, and convergence. These hurdles stem from the bottleneck in communication, characterized by sporadic and irregular connectivity between LEO satellites and ground stations, coupled with the limited computation capability of satellite edge computing (SEC). This paper proposes a novel FL-SEC framework that empowers LEO satellites to execute large-scale machine learning (ML) tasks onboard efficiently. Its key components include i) personalized learning via divide-and-conquer, which identifies and eliminates redundant satellite images and converts complex multi-class classification problems to simple binary classification, enabling rapid and energy-efficient training of lightweight ML models suitable for IoT/edge devices on satellites; ii) orbital model retraining, which generates an aggregated "orbital model" per orbit and retrains it before sending to the ground station, significantly reducing the required communication rounds. We conducted experiments using Jetson Nano, an edge device closely mimicking the limited compute on LEO satellites, and a real satellite dataset. The results underscore the effectiveness of our approach, highlighting SEC's ability to run lightweight ML models on real and high-resolution satellite imagery. Our approach dramatically reduces FL convergence time by nearly 30 times, and satellite energy consumption down to as low as 1.38 watts, all while maintaining an exceptional accuracy of up to 96%.


Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis

arXiv.org Artificial Intelligence

In recent years, machine learning has become crucial in remote sensing analysis, particularly in the domain of Land-use/Land-cover (LULC). The synergy of machine learning and satellite imagery analysis has demonstrated significant productivity in this field, as evidenced by several studies. A notable challenge within this area is the semantic segmentation mapping of land usage over extensive territories, where the accessibility of accurate land-use data and the reliability of ground truth land-use labels pose significant difficulties. For example, providing a detailed and accurate pixel-wise labeled dataset of the Flanders region, a first-level administrative division of Belgium, can be particularly insightful. Yet there is a notable lack of regulated, formalized datasets and workflows for such studies in many regions globally. This paper introduces a comprehensive approach to addressing these gaps. We present a densely labeled ground truth map of Flanders paired with Sentinel-2 satellite imagery. Our methodology includes a formalized dataset division and sampling method, utilizing the topographic map layout 'Kaartbladversnijdingen,' and a detailed semantic segmentation model training pipeline. Preliminary benchmarking results are also provided to demonstrate the efficacy of our approach.


We used AI and satellite imagery to map ocean activities that take place out of sight, including fishing, shipping and energy development

AIHub

Humans are racing to harness the ocean's vast potential to power global economic growth. Worldwide, ocean-based industries such as fishing, shipping and energy production generate at least US 1.5 trillion in economic activity each year and support 31 million jobs. This value has been increasing exponentially over the past 50 years and is expected to double by 2030. Transparency in monitoring this "blue acceleration" is crucial to prevent environmental degradation, overexploitation of fisheries and marine resources, and lawless behavior such as illegal fishing and human trafficking. Open information also will make countries better able to manage vital ocean resources effectively. But the sheer size of the ocean has made tracking industrial activities at a broad scale impractical – until now.


Towards Global Glacier Mapping with Deep Learning and Open Earth Observation Data

arXiv.org Artificial Intelligence

Accurate global glacier mapping is critical for understanding climate change impacts. It is challenged by glacier diversity, difficult-to-classify debris and big data processing. Here we propose Glacier-VisionTransformer-U-Net (GlaViTU), a convolutional-transformer deep learning model, and five strategies for multitemporal global-scale glacier mapping using open satellite imagery. Assessing the spatial, temporal and cross-sensor generalisation shows that our best strategy achieves intersection over union >0.85 on previously unobserved images in most cases, which drops to >0.75 for debris-rich areas such as High-Mountain Asia and increases to >0.90 for regions dominated by clean ice. Additionally, adding synthetic aperture radar data, namely, backscatter and interferometric coherence, increases the accuracy in all regions where available. The calibrated confidence for glacier extents is reported making the predictions more reliable and interpretable. We also release a benchmark dataset that covers 9% of glaciers worldwide. Our results support efforts towards automated multitemporal and global glacier mapping.